=Paper= {{Paper |id=Vol-273/paper-3 |storemode=property |title=Promotion of Ontological Comprehension: Exposing Terms and Metadata with Web 2.0 |pdfUrl=https://ceur-ws.org/Vol-273/paper_40.pdf |volume=Vol-273 |dblpUrl=https://dblp.org/rec/conf/www/GibsonWS07 }} ==Promotion of Ontological Comprehension: Exposing Terms and Metadata with Web 2.0== https://ceur-ws.org/Vol-273/paper_40.pdf
       Promotion of Ontological Comprehension: Exposing
               Terms and Metadata with Web 2.0
        Andrew Gibson                                   Katy Wolstencroft                                Robert Stevens

                                                       University of Manchester
                                             School of Computer Science, Kilburn Building,
                                                    Oxford Road, Manchester, UK
                                                     +44 161 275 0649
                                             andrew.p.gibson@manchester.ac.uk

ABSTRACT
Knowledge artifacts that have been labeled as ontologies have many different qualities and intended outcomes. This is particularly true of bio-ontologies, where high demand has led to a rapid growth in the number of these artifacts. Good communication between the human agents involved in the life cycle of ontologies is essential for the ontologist to encode the right knowledge in the ontology. Not only this, but it should be encoded such that subsequent retrieval of the knowledge from the ontology by any agent can be clear and precise. The ontologist can encode ontological statements, for interpretation by a computer agent, or meta-ontological statements, for interpretation by human agents. We consider how the current communication between agents and ontologies produces drawbacks that add to the considerable overheads associated with ontology development. We describe the processes of communication between human agents and ontologies as Ontology Comprehension. We then suggest how these processes could be augmented, particularly with the use of Web 2.0 ideas. By exposing and enhancing the social interactions involved in ontology comprehension, development overheads are potentially reduced and the prospect of ontology sharing and reuse is improved.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – representations, representation languages.

General Terms
Design, Human Factors, Standardization, Languages.

Keywords
Ontologies, Semantic Web, Web 2.0, OWL, Ontology Comprehension.

Copyright is held by the author/owner(s).
WWW 2007, May 8-12, 2007, Banff, Canada.

1. INTRODUCTION
The technologies of the Semantic Web [6] have been centrally conceived, specified and designed with recommendations by the W3C¹. This next generation Web promises to transform the information Web into a machine computable utopia for semantically described data and information. Despite the development of the technologies, there is, however, little evidence of the materialization of the Semantic Web (or Webs). Simple RDFS vocabularies such as Friend of a Friend have provided small views on the potential of the Semantic Web [9]. Rich ontological views supported by reasoning have appeared in applications [27, 30, 31], but less so in the Web itself, and when they do, they often represent unconnected niche pockets of interest.

¹ http://www.w3.org/2001/sw/

In contrast, Web 2.0 is in the here and now, in use by large interconnected user communities, and is ever growing as more people adopt and contribute to various community efforts. To try and specify Web 2.0 would almost be a contradiction in terms, and restricting its users with strong recommendations would be seen as an attempt to unnecessarily limit the creativity of those who have something new to try. Taxonomies give way to folksonomies, letting the user mark up things lightly on the Web rather than specify a typed URI. The technologies of Web 2.0 were not specified; they evolved out of clear and present needs of users to connect with one another. The principles of Web 2.0 grow out of a mixture of hindsight and insight into current practice, and revolve around online community building, quick and easy linking, and unlimited customization in the hands of the masses. In this article we use ‘Web 2.0’ to refer to these principles rather than any specific technology.

It has not gone unnoticed, however, that the artifacts that will support the Semantic Web, such as vocabularies and ontologies, need populating [25, 26], and for this to happen, both the technology and the nature of ontology building need to be accessible to the masses. Similarly, in the computer science view, knowledge artifacts such as ontologies inherently have this community aspect: they are shared conceptualizations that aim to enable both human and computational interoperation of diverse resources at a semantic level.

The simplicity and robustness of HTML fuelled the growth of the current Web, but the highly specified nature of the technologies in the Semantic Web recommendations suggests that the semantic side of the development, delivered through ontologies, will be driven mostly by experts. It is therefore key that this barrier of complexity is somehow lowered through creating an easier user
experience, and that the motivators that are driving Web 2.0 are harnessed to promote uptake of Semantic Web ideas.

In this paper we consider the social and communication dependent aspects of the ontology development life cycle, and identify problems encountered by people with specific roles of interaction. From this, we suggest that a clear, layered separation is made between statements in ontologies that are logical and those that are linguistic, supporting annotations on the ontology. In doing so, the annotations can be exposed to the collaborative aspects of Web 2.0, promoting light discussion at the level of natural language about the meanings of terms, whilst leaving the heavier encoding of knowledge into OWL as a task for ontologists.

2. ONTOLOGIES AND DEVELOPMENT
The central premise of the Semantic Web is enabling computational processing of Web resources through knowledge artifacts. The W3C have provided the Resource Description Framework (RDF) and the Web Ontology Language (OWL) recommendations. The latter, particularly in its OWL-DL variant, is offered as a means of building robust property-based descriptions with a logical underpinning that can be used not only to provide vocabulary for describing Web content, but also to support reasoning across Web content [20]. Such ontologies are to be the semantic backbone for linking resources in the Semantic Web. Additionally, these ontologies are to represent knowledge of domains, and have the virtues of being sharable and reusable. As yet, it is difficult to find an ontology that could be said to have been designed to fit the criteria for enabling a Semantic Web by being domain general and rich in content. One prominent example of an ontology approaching these criteria is the Foundational Model of Anatomy (FMA) [12, 23]. The FMA could be said to be more of a true domain ontology (or reference ontology) than any other in bio-medicine. However, even the FMA has barriers to the Semantic Web goals of sharing and reuse because of its large size, perhaps because it was developed in Frames and later converted to OWL.

In computer science, the term ontology covers a broad range of knowledge artifacts. Glossaries, vocabularies, thesauri, informal and formal ontologies (both in language and ontological discrimination) are all used at various points in the Semantic Web. Different levels of expressiveness (sometimes called formality) come from the purpose and demands of the ontology being developed [28]. These demands can be considered with increasing levels of expressiveness, from very “light-weight” term lists, thesauri, dictionaries or hierarchies up to “heavy-weight” ontologies with very expressive constraints [10, 25]. OWL-DL offers a formal language and can be used to build rich, logical representations of descriptions of what exists; it can also be used, in various forms, to develop other forms of knowledge artifact while still retaining strict language semantics in the representation, but weakening the ontological distinctions made in




the knowledge artifact.

Figure 1: Ontology Comprehension: Current model of interactions between various agents and an ontology, as described in Section 3. The human agents are not necessarily different individuals, but rather are separated here by the roles fulfilled in the development and inspection processes.

Building OWL-DL logic based ontologies is a difficult process [21] and reaching a community consensus is hard, especially in complex domains such as biology, where knowledge for making ontological distinctions can be incomplete. These issues need to be addressed if ontologies are to play their role in the Semantic Web. Here, we are mostly interested in the aspect of reaching a community consensus. Focus is often placed on the aspect of collaborative ontology building, that is, a group of people working directly with one ontology. We do not aim to discuss this type of system, as we see such systems as expert systems for logic-savvy ontologists rather than currently being suitable for “the masses”. Much more work needs to be done on enabling true collaboration in logic based ontologies. Instead, we currently envisage a core of expertise for logic encoding supported by people conceptualizing and gathering linguistic material. We acknowledge that there is a wealth of methodologies that address certain aspects of the ontology development lifecycle [10, 29] and evaluation [8, 24]; good reviews of these fields can be found in the references. For the purposes of this paper, we wish to focus on the social interactions during these processes rather than the processes themselves.

3. ONTOLOGY COMPREHENSION
We learn from the field of software engineering that effective reuse of elements of object oriented frameworks is reliant on many levels of understanding from the point of view of the programmer [4, 15]. In software engineering, improving these levels of understanding is known as “software comprehension”, and we extend the principles to ontology development. We outline ontology comprehension as the interaction between human agents and the knowledge expressed in an ontology.

Figure 1 outlines the interactions between various agents and an ontology that are considered in this section. There are two main modes in which ontology comprehension is important:

     1.   Development mode. Ontology development requires that there is efficient interaction between experts that represent the knowledge of the domain in the scope of the ontology (domain experts) and the ontologist that is responsible for the construction and continued maintenance of the ontology. Here we assume a model where, for a specific ontology development exercise, there is a limited cohort of domain experts that are involved with an ontologist.
     2.   Inspection mode. Ontology inspection is a light evaluative process in which an agent goes through the ontology to quickly assess whether or not that ontology is of good quality and whether what it contains is suitable for some specific needs of the inspector.

What follows is an outline of task models that highlight how, currently, the interactions of agents involved with ontologies lead to discrepancies in ontology comprehension.

3.1 Task Model 1: Ontology Development
We consider early ontology development as a process that begins with the lightest possible knowledge structure, essentially a term list, and subsequently moves up through levels of complexity and expressiveness of the types discussed in [10]. This happens socially as well as in the ontology as all those involved in the development become more familiar with scope. At the beginning of the ontology development life cycle, the ontologist (assuming they have no domain knowledge) will usually rely on the domain expert to provide a core set of terms from the domain of interest as a starting point. The initial scope of the ontology, rather than being rigidly defined, is often roughly determined from the initial term list and this will get refined as things move on. At this early stage it is necessary for the domain expert to be able to quickly assess if the terms are appropriate. As things are, the easiest way to do this is for the domain expert to be able to access the ontology for themselves and browse the hierarchy of terms, whilst checking and adding in textual annotations for the terms, as well as any comments about the specific or contextual use of any of the terms.

The ontologist will be using one of the commonly available ontology development tools such as Protégé-OWL², Swoop³ and OBO-Edit⁴. All of these tools are centered on the user interacting with a class hierarchy view, which the ontologist will be building from the terms given to them by the domain expert. At this stage, the domain expert will primarily be concerned with having the correct term-definition pairs represented in the proto-ontology. Decisions regarding the class hierarchy signal the beginning of a slightly more complex level of expressivity, as the ontologist will be making assertions between classes about subsumption relationships [14]. This is especially true of OWL ontologies, and such decisions do not necessarily need to be considered for simpler controlled structured vocabularies in which hierarchical relationships “broader than” and “narrower than” are possible. The ontologist may also start to guide the domain expert in how to transfer knowledge regarding some of the more fundamental object properties such as part-hood.

² http://protege.stanford.edu/
³ http://code.google.com/p/swoop/
⁴ http://www.oboedit.org/

At some point, the domain experts need to let the ontologists start to make even more expressive assertions in the ontology, the implications of which they may not necessarily understand for themselves. This signals the next stage of ontology development, in which the balance shifts so that the ontologist starts to refine the assertions in the ontology. Instead of being instructed and guided by the domain expert, the ontologist now needs to ask careful questions of the domain expert. The aim of these questions should be to extract the intrinsic meaning of the terms that the domain expert has provided so that the ontologist can encode these meanings into the ontology using more and more expressive restrictions and axioms. Significantly, unless the domain expert has had training in understanding the meanings of logical assertions of ontologies, they will still primarily rely on the lexical annotations and definitions when evaluating the ontology.

Once the content of the ontology has begun to stabilize (i.e. there are fewer major revisions in the content of the ontology being made) it will be made available to a wider audience. This can signal a whole new critical process of revision for the ontology. In the next section we will consider what sort of interactions may occur between different agents and ontologies when they are first encountered.

Eventually, the increase in the content of the ontology, both lexical and logical, should start to level off as the content and the intended scope converge, at which time further structural modifications may be made, such as modularization, which could happen once
the micro-organization of the knowledge in a domain has become clear. A publicly available and relatively stable ontology has a new set of requirements, which the topics of ontology evolution and change management address [19]. Change management of ontologies has been considered in a technological sense for some time, and it should be clear that changes to a publicly available ontology need to be transparent. However, there is a growing trend for including extra hierarchical structures into the ontology that represent deprecated classes (e.g. [30]). The need to do this is obvious; it is less so how to do it neatly and ontologically. Versioning and related tasks are all parts of the ontology life cycle that have no real, consistent support.

The following discrepancies in ontology comprehension should be clear from this section.

     1.   Discrepancies in Early Development
               a.   The most convenient means of constructing, looking at and sharing the early term list is, unusually, from within an ontology file, which implies some hierarchical structure.
               b.   Early revisions of the ontology are experimental for the ontologist, yet are still subject to inspection and lexical evaluation from the domain expert.
               c.   Domain experts, having looked directly at revisions of the ontology file, may be resistant to subsequent major changes in structure and terminology by the ontologist as knowledge is disambiguated.
     2.   Emerging Discrepancies
               a.   Inclusion of information regarding deprecated classes into the class hierarchy of the ontology.
     3.   Communicative Discrepancies
               a.   Discussions between the domain experts about terminology that are potentially crucial for ontology comprehension are lost or are completely separated from the ontology itself.
               b.   Discussions between the domain experts and the ontologists about disambiguation of terms are lost or are completely separated from the ontology itself.
               c.   Potential for misinterpretation of logical aspects of the ontology by the domain experts through exposure to the logical component.

3.2 Task Model 2: Ontology Inspection
Ontologies are complex entities. If any ontology is going to get used by someone other than the person or group that implemented it, there has to be a way in which it can be decided whether or not it is an appropriate ontology for the task in hand [18]. Currently, this inspection process is difficult because of the paucity of ontologies available, and the fact that many have been designed for a specific purpose. Also, the discrepancies listed in 3.1 result in a general lack of information that can aid effective inspection and overall ontology comprehension.

It is hard not to liken an ontology inspection process to some sort of evaluation. What we describe here is fairly close to ontology selection [24], except that ontology inspection is more of a browsing process, driven by what access there is to comparative information between several ontologies. Selection has much better defined initial parameters for the desired outcome, and can give a more targeted outcome. We do not wish to label this inspection as an evaluation, however, as we do not make the assumption that the inspector will be following any pre-determined criteria, or, if they are, that the criteria are rational.

The ontology inspection process is short lived, and for many people’s goals, the choice of beginning a new ontology that they know will satisfy their criteria is more favorable than editing an existing one. However, such inspections can quickly be deemed fruitless when the term searched for turns out not to be defined by logical statements in an ontology. This is a common occurrence, as such ‘classes’ can be placeholders for future development or intrinsically defined terms where no logical definition was thought necessary. Ontologies can be intensely developed in one particular area where immediate goals are important, yet there is no way to effectively discover this other than through thorough browsing. For the goals of the Semantic Web, it is imperative that the information required to carry out this inspection process be made as clear as possible for the inspector, such that we do not see immense reproduction of individual effort and no clear “shared conceptualizations”.

The domain knowledge these ontologies describe can require a considerable amount of understanding for anyone trying to inspect them. There are several ways in which this can be the case.

     1.   The domain knowledge encoded may be outside the experience of the inspector, or in a different context to what was expected. The inspector may not be able to tell if the knowledge represented is valid because it is not within their expertise, and will need to seek help and advice from a domain expert.
     2.   The knowledge may be appropriate, but encoded with axioms and restrictions that the inspector may not be able to accurately interpret as real world meaning, such that they have to find the advice of an ontologist.
     3.   The ontology may have been written for a specific purpose. The inspector may not be able to tell whether this is the case, and could therefore assume that the first or second scenario above is true, unless it is possible to seek advice from the original authors or find a resource containing this information.

The three scenarios above are serious issues for the future of ontologies in the Semantic Web. Most ontologies are developed as part of projects, and projects are usually pragmatic in terms of their goals. Hence, people build these ontologies as application ontologies that serve the immediate needs of the project. There is no perceivable immediate benefit for a project to develop a more general domain ontology in tandem with an application ontology, and so it does not happen. Consequently the Semantic Web goals of sharing and reuse become much harder, as people will tend to assess these application specific ontologies as too specific for a new purpose, as they see that they will need to invest effort in their re-engineering. Another danger here is that, with so many application ontologies being developed, inspectors will start to assume that unusual features of ontologies are the result of the needs of an application, and dismiss the ontology as
potentially unusable. What is really needed is for the inspector to                    b.    No indication without exploration of the level
be sure what sort of artifact they are looking at by having easy                             of effort put into different areas of an
access to certain parameters.                                                                ontology.
In the Semantic Web vision, the first course of action for an
ontologist would be to verify the existence or non-existence of a       4. DESIDERATA FOR SEMANTIC
domain ontology with close or overlapping scope to the ontology         ONTOLOGY COMPREHENSION
they are to develop. This process will be laborious if it relies on     Section 3 highlights the social and communicative discrepancies
the current practice of downloading ontologies and browsing them        that prevent an effective amount of ontology comprehension that
to see if they are at all reusable. In response to this, technologies   is required for the uptake of the Semantic Web goals of ontology
such as Swoogle [16] and AKTiveRank [2] are starting to provide         sharing and reuse. This section cross-analyses these discrepancies
access to online ontologies through page ranking and other              to produce some desiderata that can be considered for future
analytical methods to establish potential target ontologies.            systems. Whilst all types of data in and about an ontology may be
However, these technologies have been criticized for ignoring the       considered ‘ontological’, we specify ‘Meta-Ontological Data’,
meaning of concepts and also relations [24]. Furthermore, we note       ‘Ontological Metadata’ and ‘Logical Statements’ as clearly
that the results returning from these searches are whole OWL            identifiable parts. For information contained within ontology files
files, free and independent of contextual information. For              that is only for human interpretation of the encoded semantic
example, a Swoogle search for “Protein” has in its top hits an          content, we use the idea of Meta-Ontological data. For data
ontology used in an educational tutorial (that in this case is          specific to an individual ontology that is necessary for interpretation
evident from its URL), which is by no means intended to be a            and inspection across the whole structure and history of
shared or reusable resource, but nonetheless is discovered and          development, we use the idea of Ontological Metadata. The ‘logical
accessible.                                                             statements’ in an ontology constitute the remainder of the content.
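
As a rough illustration of the three parts distinguished in Section 4 (Meta-Ontological Data, Ontological Metadata and Logical Statements), the following Python sketch partitions a handful of ontology statements into those layers. Everything in it is an assumption made for illustration only: the encoding of statements as plain (subject, predicate, object) tuples, the hypothetical `split_layers` helper, the `ex:` names, and the choice of `rdfs:label`, `rdfs:comment` and `skos:definition` as the human-facing annotation properties. It is not a description of any existing OWL tool.

```python
# Illustrative sketch only: statements are plain (subject, predicate, object)
# tuples with prefixed names, not parsed OWL or RDF.

# Assumed set of annotation properties intended for human readers.
ANNOTATION_PREDICATES = {"rdfs:label", "rdfs:comment", "skos:definition"}


def split_layers(triples, ontology_iri):
    """Partition triples into the three layers named in Section 4.

    - ontological_metadata: statements about the ontology as a whole
    - meta_ontological: human-readable annotations on individual terms
    - logical: everything else (the remainder of the content)
    """
    layers = {"ontological_metadata": [], "meta_ontological": [], "logical": []}
    for subject, predicate, obj in triples:
        triple = (subject, predicate, obj)
        if subject == ontology_iri:
            layers["ontological_metadata"].append(triple)
        elif predicate in ANNOTATION_PREDICATES:
            layers["meta_ontological"].append(triple)
        else:
            layers["logical"].append(triple)
    return layers


# Hypothetical toy ontology used only to exercise the function.
EXAMPLE = [
    ("ex:onto", "rdf:type", "owl:Ontology"),
    ("ex:onto", "owl:versionInfo", "0.1"),
    ("ex:onto", "dc:description", "Toy ontology for a tutorial"),
    ("ex:Protein", "rdf:type", "owl:Class"),
    ("ex:Protein", "rdfs:label", "protein"),
    ("ex:Protein", "rdfs:comment", "A chain of amino acids."),
    ("ex:Enzyme", "rdfs:subClassOf", "ex:Protein"),
]

layers = split_layers(EXAMPLE, "ex:onto")
for name, statements in layers.items():
    print(name, len(statements))
```

Under these assumptions, only the `meta_ontological` layer would need to be exposed for light, natural-language discussion, while the `logical` layer stays with the ontologist.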
Those inspecting ontologies can find themselves in an isolated
situation where Web searches and personal inspection of an              4.1 Separating the Ontological from the
ontology or its documentation are the only means to ontology            Meta-Ontological
comprehension. It has already been recognized that the Web has
                                                                        Ontologies come with a considerable amount of meta-ontological
enormous potential for social organization and engagement. In
                                                                        information (or should do so), which is used by human readers to
ontology comprehension, for example, it offers the means of
                                                                        assess and see the intended use of that ontology. Much of this
asking those who know. It also, as Wikipedia has shown, offers
                                                                        meta-ontological information is linguistically orientated. These
the means by which elements that aid ontology comprehension
                                                                        meta-ontological extensions to the ontology itself are meaningless
can be developed. Having concluded the need for ontology
                                                                        strings to the computer, and in this respect are unnecessary in so
comprehension, we now explore what is necessary for such a
                                                                        far as the computational goals of the Semantic Web are
facility.
                                                                        concerned. We know that this meta-ontological information is
The following discrepancies in ontology comprehension should            necessary, but we also see that it is not convenient to access; it
be clear from this section:                                             lacks the human resources that often make the most of such
     1.   Discovery Level Discrepancies                                 material, as in Section 3.2 where a lack of a single access point
                                                                        means that secondary information needs to be sought out
               a.   Targeted discovery based on search for terms        manually.
                    rather than meanings
                                                                        In reality, we have a chance to design and build support for the
               b.   Ontologies are discoverable independently of        meta-ontological in the light of current experience. OWL has
                    statements of purpose, scope etc.                   virtually no support, apart from some ad hoc solutions, for
               c.   Searches may discover anything from tutorial        carrying meta-ontological knowledge. We would advocate such a
                    OWL files, programmatic OWL fragments,              separation of the ontological from meta-ontological and this is
                    application   ontologies,    outdated     or        where a Web 2.0 approach could help.
                    unmaintained ontologies etc.                        Our current scenario places too much reliance on assessment
     2.   Ontology Level Discrepancies                                  through simple linguistic inspection of, for instance, terms. These
                                                                        are labels for concepts and a simple assumption of lexical
               a.   Statements of scope, purpose, expressivity etc
                                                                        matching implying conceptual matching is dangerous. For
                    are often missing altogether, or require extra
                                                                        example, in biology, it might seem safe to assume that hepatocyte
                    searches to discover them.
                                                                        and liver cells are the same thing. In fact, cells in the liver include
               b.   Discussions that have affected overall              hepatocyte cells, but also include adipocyte cells. Hepatocytes
                    ontology development are not recorded               make the liver the liver, but there are other cells too.
               c.   Minimal opportunities to interact with the          Ontologies are only intuitively discoverable through the
                    development team                                    identification and inspection of the appropriate individual terms.
     3.   Term Level Discrepancies                                      Even the construction of linguistic definitions can leave
                                                                        ambiguous meanings for those inspecting an ontology, with no
               a.   Feeding in from Section 3.1, ontologies need        real way to find out how those definitions were converged upon.
                    exploring in the development environment to         Even with logical definitions, we still rely upon natural language
                    assess appropriateness of terms.                    labels. The aim of languages such as OWL-DL is, however, to
                                                                        minimize potential ambiguities through logical descriptions.
Overall, there should be a synergy between logical and linguistic        knowledge in a way not possible in file-oriented development.
definitions.                                                             The ontologists have a way to interact with the domain experts as
Non-ontologist domain experts will attach intrinsic meaning to           a community to perform tasks such as the disambiguation of terms
terms by drawing on their internal knowledge and the context in          before they have been encoded in an ontology, reducing the
which a term is used. It is possible to restrict the intrinsic meaning   chance that major revisions of ontological structure will be
of a term using the consensus of a domain, so long as it is stated       required. As this resource is shared and linkable, project and
in the context of the purpose of the controlled vocabulary.              domain contexts for terms can be established. These contexts can
Interpretation of meaning in these controlled vocabularies still         be used by both the ontologists and the domain experts to traverse
requires a human agent and the knowledge is logically                    the gap into discussions that involve other groups, and discover
inaccessible to a computer agent.                                        overlapping scopes more intuitively. Additionally, these resources
                                                                         would provide ideal testing grounds for lexical research (e.g. [7])
Thus, we define the inline linguistic portions of ontology files as      that should lead to future improvement on methodologies for
meta-ontological data. These include anything that a human agent         these workspaces.
would use for the translation of specific complex logical
statements into meaning (including links to other meanings) but
are also intrinsically meaningless to the computer. Primarily,
these are:
     •    Terms
              o     The specific string by which the logical
                    meaning is labeled, usually considered as the
                    real meaning.
               o E.g. (from celltype ontology) ‘subsidiary cell’
      •   Synonyms
               o Any number of labels that refer to the same
                    meaning.
               o E.g. (cont’d) ‘accessory cell’
      •   Definitions
               o Short, concise description of the meaning,
                    including links to other terms.
               o E.g. (cont’d) ‘An epidermal cell associated
                    with a stoma and at least morphologically
                    distinguishable from the epidermal cells
                    composing the groundmass of the tissue’
      •   Annotations (examples of)
               o Longer, more verbose descriptions.
               o Examples of how the term is used.
               o Explanations of contextual use for the term.
               o Links to term provenance.
               o E.g. (cont’d) DBXREF - TAIR:0000296
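The four kinds of meta-ontological data listed above can be pictured as a small record kept apart from the logical statements. The following is only a minimal sketch: the class name `MetaOntologicalRecord`, its field names and the helper `labels` are hypothetical illustrations and are not part of OWL, OBO or any existing tool. The values are the celltype ontology examples given in the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetaOntologicalRecord:
    """Human-oriented, linguistic information about one term (hypothetical structure)."""
    term: str                                         # the label for the logical meaning
    synonyms: List[str] = field(default_factory=list)  # other labels for the same meaning
    definition: str = ""                              # short textual definition
    annotations: Dict[str, str] = field(default_factory=dict)  # provenance, usage notes, etc.


def labels(record: MetaOntologicalRecord) -> List[str]:
    """Return every string a human might use to refer to this meaning."""
    return [record.term, *record.synonyms]


# Example values taken from the list above (celltype ontology).
subsidiary_cell = MetaOntologicalRecord(
    term="subsidiary cell",
    synonyms=["accessory cell"],
    definition=(
        "An epidermal cell associated with a stoma and at least "
        "morphologically distinguishable from the epidermal cells "
        "composing the groundmass of the tissue"
    ),
    annotations={"DBXREF": "TAIR:0000296"},
)

print(labels(subsidiary_cell))
```

None of these fields carries meaning for a reasoner; they are strings for human readers, which is why they can be managed separately from the logical statements.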
Achieving the separation of this meta-ontological layer allows for
the consideration of how to manage this mostly linguistic
information. This separation is our major desideratum and from
this flows the means by which Web 2.0 can provide a platform to
expose meta-ontological information and harness and extend the
range of group activities.

4.2 Promoting Social Interaction – A Meta-
Ontological Workspace
Explicit logic based ontologies for the Semantic Web are going to
need to capture implicit knowledge with axioms and restrictions.
Yet, unless the experts with the knowledge all manage to learn
how to interpret complex logical statements, there needs to be a
workspace in which implicit knowledge can be discussed and
defined lexically within expert groups. In other words, terms and          Figure 2: An augmented form of ontology comprehension.
term linked information can essentially exist independently of the         Ontology Metadata and Meta-Ontological statements have
formal environment of ontologies. This implicit knowledge can be           been separated from the Logical Statements and have been
used by ontologists as a resource. With such a resource,                       exposed to a community using Web 2.0 principles.
development of early stage ontologies will not require the
                                                                         Generating discussion of implicit meaning may sound a little like
construction of formal hierarchies until a critical amount of
                                                                         cutting the domain expert out of the ontology construction
implicit knowledge has been collected in these more lightweight
                                                                         process. It should in fact considerably reduce the overhead of
resources. Also, multiple hierarchies for different purposes could
                                                                         ontology development by shifting the discussions based around
be constructed from the same resource, reusing the collected
intrinsic meanings of terms and which terms are the most                domain ontologies, this can be marked up and become visible
appropriate to use away from the attention of the ontologists. It is    such that ontology level provenance, a history of where
important not to make the division too wide, as there is a risk that    everything in an ontology originated from and how it changed
bias could creep in from the ontologists as the domain experts          over time, can start to be built up.
would be unable to assess the implications of certain restrictions      Introducing a strong community aspect would encourage those
and axioms. In terms of feedback to the domain expert from the          developing ontologies to start using tagging, thereby linking up
ontologist, we recognize that there is a need for some sort of          their ontologies to particular domains and projects. Domain
consistent translation methodology that can generate accurate           ontology construction could be promoted by using ranking
textual definitions from logical statements, but we consider this       systems where inspectors can rate how useful the ontology was in
outside of the scope of this article.                                   terms of what was expected, assuming that more general
It should be clear that such discussion workspaces would be well        ontological models will fit the requirements of more people.
suited for Web 2.0 style systems. These workspaces should               OWL has been made popular for use as an ontology language
promote the creation of lexicons in which a group of experts can        because of the publicity of the Semantic Web, accessibility of
start to add in and inspect lexical information. In this implicit       tools for creating OWL ontologies and the fact that it is useful
view, it is the terms that are the focus of discussion, not the         beyond the scope of the Semantic Web. OWL has been used for a
ontological interpretation, which are two different goals that          lot of purposes, and searching for ontologies based on the content
sometimes get confused during ontology development. Within the          of their files seems like it may be unsustainable as the number of
workspace, the terms can be discussed, and annotated with textual       files grows faster than the number of useful ontologies.
definitions, comments about usage, links to synonymous terms,
requests for clarification etc. Helium was, for instance, discovered    Efficient inspection of ontologies can be limited by their sheer
in 1894. Of course it was the category of Helium that was               size. The current tendency is to build larger ontologies, as
discovered, not the instances of the helium atoms (which                the tool support and methodologies for modularization have been
presumably have existed long before 1894). This is an example           slow off the mark until recently. As we learn more about the
of meta-class statements that are part of the ontology. They are        implications and methods of modularization [22], ontologies can
class level statements, but ones that are well suited to this           become more manageable, reducing the amount of evaluation cost
linguistic, community style of interaction.                             per ontology. This of course will require better indexing, along
                                                                        with information about how each ontology has been
The purpose of targeting Web 2.0 as a base for this meta-               used/imported, perhaps leading to a ‘shopping cart’ model for
ontological data is not to completely remove this type of               highly modular ontology construction.
information from the view of the ontology, we merely seek to
relocate it so that the incredibly social nature of the definition of   Perhaps one of the most motivating factors for achieving this
knowledge can be coupled with an environment that is equally            desire for more effective inspection is the aspect of learning. Once
socially driven (see Figure 2). Modularization of ontologies is         it becomes easy to empirically see what constitutes a good and
seen to be one of the keys to making ontologies viable for the          useful ontology, then these features get propagated and discussed.
Semantic Web vision, and as such, import mechanisms exist that          As has been noted in [1], the viral spread of understanding how to
support the combination of different sources. Lexicons developed        write HTML was in part because existing HTML could be
by groups could be given URIs, as could all of the terms                inspected and copied. Also, the effect of newly written HTML
described in them. Knowledge held in WordNet [17] style lexical         was instantly verifiable in a Web browser. It is harder to have this
resources could be linked using online URIs in a similar way to         sort of verification with ontologies, and there are a lot of
imported online ontologies taking advantage of well established         conflicting styles of ontology development with no consensus of
methods for dealing with words and their meanings at the lexical        what is ‘right’. If the Ontological Metadata Workspace were to be
level.                                                                  realized, then a hub of comparable, commented and marked-up
                                                                        ontologies could develop in a much quicker and more consistent
                                                                        fashion than the solitary efforts that are currently the norm.
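The suggestion in Section 4.2 that community lexicons and their terms could be given URIs can be sketched as follows. This is an illustration under stated assumptions: the base address `LEXICON_BASE` (an example.org placeholder), the function `term_uri` and the underscore-based naming scheme are all hypothetical and are not taken from any existing lexicon service.

```python
from urllib.parse import quote

# Hypothetical base URI for a community lexicon; example.org is a placeholder.
LEXICON_BASE = "http://example.org/lexicon/celltype/"


def term_uri(term: str, base: str = LEXICON_BASE) -> str:
    """Mint a URI for a term by normalizing its label (assumed naming scheme)."""
    slug = quote(term.strip().lower().replace(" ", "_"))
    return base + slug


# Synonyms are recorded as links to the URI of the preferred term, so that
# different labels resolve to the same entry in the lexicon.
preferred = term_uri("subsidiary cell")
synonym_links = {"accessory cell": preferred}

print(preferred)
print(synonym_links["accessory cell"] == preferred)
```

An ontology could then refer to the lexicon entry by URI rather than carrying the strings inline, in the same way that it imports other online ontologies.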
4.3 Promoting Ontology Sharing and Reuse:
An Ontological Metadata Workspace                                       5. BIO-ONTOLOGIES: EXPERIENCES
The production of ontologies that can be effectively shared and
reused is a major step towards achieving the goals of the Semantic      AND PERSPECTIVES
Web. There are significant barriers to these goals in our current       While our discussion in this article is most pertinent to the notion
model of ontological comprehension. We have highlighted how             of the Semantic Web as a whole, it originates from the discipline
ontologists and domain experts alike need to inspect ontologies to      of bioinformatics. Biologists were early adopters of the Web as a
assess whether they are appropriate for their needs. Currently, the     means of disseminating data and the tools for their analysis. These
information that would be necessary to effectively conduct this         data and tools are developed in a highly autonomous manner and
investigation is hard to find, and does not always come in the          consequently they are beset by both syntactic and semantic
same format.                                                            heterogeneities. Bioinformaticians have seen ontologies as a
                                                                        means to create common understandings for human and
An ontological metadata workspace would provide access to               computers about the meaning of data in their distributed resources
whole-ontology level information for ontologies necessary to            in a life science Semantic Web [13]. The DNA sequences of
carry out light evaluative processes. A collaborative Web 2.0           different organisms, for example have a common representation,
approach to ontologies would see ‘ontology profiles’ that include       but this is not so for the functional knowledge associated with
clear statements about the purpose and scope of the ontology and        those sequences. So, the sequences can be interpreted by humans
information regarding its status. Ontologies would clearly be           and computers, but not what is known about those sequences.
labeled as domain and application ontologies to help evaluation,
and subsequently, when application ontologies are derived from
Consequently, biologists have created ontologies to describe, for        such as ‘develops_from’. Despite having the full expressivity of
instance, the functional attributes of DNA and proteins [3, 11].         OWL available in the OBO 1.2 syntax, there is little evidence to
Bioinformatics has, therefore, much Web accessible data                  suggest that the developers in this community either see the need
described by ontologies. The W3C have recognized a nascent               or have the will to take on this level of expressivity in their
Semantic Web in this domain in the development of the Health             knowledge.
Care and Life Sciences SIG 5 . It is a significant feature of the        Perhaps then, this community can be a model for the future of
move towards ontologies in this sector that it is biologists who         ontology development on the Web. Quick and easy development
build these tools, with some guidance from ontologists. Whilst           of terms by engaging the user, employing Web 2.0 design
this community has not made great use of OWL, preferring its own         principles to forge more coordinated communities for
representation, OBO 6 , it still provides a good representation of       development of Semantic Web technologies. Web 2.0 has the
Semantic Web activities.                                                 capability to expose all of the ‘light’ lexical issues and some basic
The OBO ontologies have significant standing in biological               assertions of linking meaning to terms. ‘Heavier’ more expressive
communities, and it is perhaps the community building aspect that        assertions in OWL are in the domain of the ontologist, who can be
fuels this standing, as it includes:                                     informed by the interactions they can have with domain experts
                                                                         and other ontologists through Web 2.0 communities.
        •    A large number of centrally available OBO ontologies 7
        •    The OBO-Edit OBO ontology development tool that is          6. DISCUSSION
             specifically designed by a working group of users.          We propose the construction of ontology specific resources, using
                                                                         the Web as a platform, which specifically deals with the
        •    A committee, the OBO-Foundry 8 , that has been set up
                                                                         management of lexical meta-ontological aspects of ontology
             and has produced a set of principles for new OBO
                                                                         development together with the management of ontology metadata.
             ontologies to aspire to, including the promise of textual
                                                                         The applications of Web 2.0 are geared towards harnessing these
             definitions for all terms and good documentation for all
                                                                         types of community interaction, which is precisely the sort of
             ontologies.
                                                                         interaction that is not supported in the current model of ontology
        •    The OBO file format, for which the primary goals            development. Dealing with meta-ontological data in
             include human readability and ease of parsing together      downloadable ontology files and disparate descriptions of
             with a syntax that makes them exportable as OWL.            ontology metadata on development sites is prohibitive to a more
        •    Pages on the SourceForge 9 open source software             universal appreciation of ontology design and implementation.
             development site, which include the potential for           A centralized resource for sharing OWL resources would act as a
             project information, forums, downloads and issue            hub for community learning, sharing and reusing of ontology
             tracking by which suggestions for new terms and             resources, bringing together ontology users and builders in a way
             modifications can be submitted.                             that is currently not possible. Designing ontologies by consensus
Contributors to OBO are starting to pull together as a virtual           in such workspaces would encourage best practice and speed up
community by pooling their resources on the Web. The Gene                the uptake of the more complicated Semantic Web technologies,
Ontology [3] saw a phenomenal growth in the number of terms it           starting with OWL and the knowledge that is to be contained
contained through user interaction alone that is well documented         within. At the same time the system would provide a measure of
[5], such was the demand for the resource to represent so many           control, ensuring that the dangers of misinterpreting the powerful
researchers. Since then, the trend has continued as more and more        semantics of OWL by untrained eyes are avoided. Having the
biological domains aim to be represented by OBO.                         community built lexical resources is the beginning of an
                                                                         opportunity to link up ontologists with a more specific system that
The caveat for the relative success of OBO has probably been             can refer to the online lexical corpus.
similar to that of Web 2.0 over Semantic Web (so far). Formality
and methodology have temporarily made way for ease of use and            The widespread realization of the Semantic Web will depend on
ease of interaction. Interestingly, the majority of the OBO              the production of ontologies that can be effectively shared and
ontologies clearly state that they are “structured controlled            reused, but in order to achieve this, the overheads of ontology
vocabularies”, which require nothing like the expressive power of        development and ontology comprehension have to be
OWL, and little in the way of knowledge engineering because the          considerably reduced. The OBO community/consortium has
statements linking things do not require it. This is simply because      effectively demonstrated the advantages of lowering these
nothing more complex is required:                                        overheads by engaging a community of domain experts in
OBO ontologies are used for marking biological data so that they         ontology development. OBO ontologies, however, are for human
can be linked if they are annotated in the same way. Primarily,          interpretation, so the true Semantic Web vision of human and
these ontologies contain a hierarchy of terms denoting ‘is_a’            computational understanding is not addressed. At the same time,
relationships. Less often but still common are ‘part_of’                 highly expressive OWL-DL ontologies, for both computer and
relationships, and occasionally other properties key to biology          human interpretation are being produced, but largely in isolation.
                                                                         We propose a solution here which would bridge the gap between
5 http://www.w3.org/2001/sw/hcls/                                        these approaches and effectively enable the same type of domain
6 http://www.geneontology.org/GO.format.obo-1_2.shtml                    expert community engagement for formal ontologies.
7 http://obo.sourceforge.net/
8 http://obofoundry.org/                                                 7. ACKNOWLEDGMENTS
9 http://sourceforge.net/                                                Funding for this work was through BBSRC grant BBS/B/17156.
8. REFERENCES                                                             and metadata engine for the semantic web Proceedings
                                                                          of the thirteenth ACM international conference on
[1]    Alani, H., Position paper: ontology construction from              Information and knowledge management, ACM Press,
       online ontologies. in Proceedings of the 15th                      Washington, D.C., USA, 2004.
       International Conference on World Wide Web                  [17]   Miller, G.A. WordNet: a lexical database for English.
       (Edinburgh, Scotland, 2006), ACM Press, New York,                  Communications of the ACM, 38 (11). 39-41.
       NY, 491-495.                                                [18]   Noy, N.F., Guha, R. and Musen, M.A., User Rating of
[2]    Alani, H., Brewster, C. and Shadbolt, N., Ranking                  ontologies: Who will rate the raters? in AAAI 2005
       Ontologies with AKTiveRank. in International                       Spring Symposium on Knowledge Collection from
       Semantic Web Conference, (Athens, GA, USA, 2006).                  Volunteer Contributors, (Stanford, CA, USA, 2005).
[3]    Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D.,       [19]   Noy, N.F. and Klein, M. Ontology Evolution: Not the
       Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K.,               Same as Schema Evolution. Knowledge and
       Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P.,               Information Systems, V6 (4). 428-440.
       Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C.,   [20]   Pulido, J.R.G., Ruiz, M.A.G., Herrera, R., Cabello, E.,
       Richardson, J.E., Ringwald, M., Rubin, G.M. and                    Legrand, S. and Elliman, D. Ontology languages for the
       Sherlock, G. Gene Ontology: tool for the unification of            semantic web: A never completely updated review.
       biology. Nat Genet, 25 (1). 25-29.                                 Knowledge-Based Systems, 19 (7). 489-497.
[4]    Austin, M.A., III and Samadzadeh, M.H., Software            [21]   Rector, A., Drummond, N., Horridge, M., Rogers, J.,
       comprehension/maintenance: an introductory course. in,             Knublauch, H., Stevens, R., Wang, H. and Wroe, C.
       (2005), 414-419.                                                   OWL Pizzas: Practical Experience of Teaching OWL-
[5]    Bada, M., Stevens, R., Goble, C., Gil, Y., Ashburner,              DL: Common Errors & Common Patterns, 2004.
       M., Blake, J.A., Cherry, J.M., Harris, M. and Lewis, S.     [22]   Rector, A.L. Modularisation of domain ontologies
       A short study on the success of the Gene Ontology. Web             implemented in description logics and related
       Semantics: Science, Services and Agents on the World               formalisms including OWL Proceedings of the 2nd
       Wide Web, 1 (2). 235-240.                                          international conference on Knowledge capture, ACM
[6]    Berners-Lee, T., Hendler, J. and Lassila, O. The                   Press, Sanibel Island, FL, USA, 2003.
       Semantic Web. Scientific American, 284. 34-43.              [23]   Rosse, C. and Mejino, J.L.V. A reference ontology for
[7]    Bodenreider, O., Burgun, A. and Rindflesch, T.C.                   biomedical informatics: the Foundational Model of
       Assessing the consistency of a biomedical terminology              Anatomy. Journal of Biomedical Informatics, 36 (6).
       through lexical knowledge. International Journal of                478-500.
       Medical Informatics, 67 (1-3). 85-95.                       [24]   Sabou, M., Lopez, V., Motta, E. and Uren, V., Ontology
[8]    Brank, J., Grobelnik, M. and Mladenic, D., A survey of             Selection: Ontology Evaluation on the Real Semantic
       ontological evaluation techniques. in Conference on                Web. in WWW2006, (Edinburgh, UK, 2006).
       Data Mining and Data Warehouses, (Ljubljana,                [25]   Schaffert, S., Gruber, A. and Westenthaler, R., A
       Slovenia, 2005).                                                   Semantic Wiki for collaborative knowledge formation.
[9]    Brickley, D. and Miller, L. FOAF vocabulary                        in Semantics, (Vienna, Austria, 2005).
       specification, 2005.                                        [26]   Shadbolt, N., Hall, W. and Berners-Lee, T. The
[10]   Corcho, O., Fernandez-Lopez, M. and Gomez-Perez, A.                Semantic Web revisited. IEEE intelligent systems, 21
       Methodologies, tools and languages for building                    (3). 96-101.
       ontologies. Where is their meeting point? Data &            [27]   Stevens, R., Baker, P., Bechhofer, S., Ng, G., Jacoby,
       Knowledge Engineering, 46 (1). 41-64.                              A., Paton, N.W., Goble, C.A. and Brass, A. TAMBIS:
[11]   Eilbeck, K., Lewis, S., Mungall, C., Yandell, M., Stein,           Transparent Access to Multiple Bioinformatics
       L., Durbin, R. and Ashburner, M. The Sequence                      Information Sources. Bioinformatics, 16 (2). 184-186.
       Ontology: a tool for the unification of genome              [28]   Uschold, M. Knowledge level modelling: concepts and
       annotations. Genome Biology, 6 (5). R44.                           terminology. The Knowledge Engineering Review, 13
[12]   Golbreich, C., Zhang, S. and Bodenreider, O. The                   (1). 5-29.
       foundational model of anatomy in OWL: Experience            [29]   Vrandecic, D., Pinto, S., Tempich, C. and Sure, Y. The
       and perspectives. Web Semantics: Science, Services and             DILIGENT knowledge processes. Journal of
       Agents on the World Wide Web, 4 (3). 181-195.                      Knowledge Management, 9 (5). 85-96.
[13]   Good, B.M. and Wilkinson, M.D. The Life Sciences            [30]   Whetzel, P.L., Parkinson, H., Causton, H.C., Fan, L.,
       Semantic Web is Full of Creeps! Brief Bioinform, 7 (3).            Fostel, J., Fragoso, G., Game, L., Heiskanen, M.,
       275-286.                                                           Morrison, N., Rocca-Serra, P., Sansone, S.-A., Taylor,
[14]   Guarino, N. and Welty, C. Evaluating ontological                   C., White, J. and Stoeckert, C.J., Jr. The MGED
       decisions with OntoClean. Commun. ACM, 45 (2). 61-                 Ontology: a resource for semantics-based description of
       65.                                                                microarray experiments. Bioinformatics, 22 (7). 866-
[15]   Kirk, D., Roper, M. and Wood, M., Identifying and                  873.
       addressing problems in framework reuse. in, (2005), 77-     [31]   Wolstencroft, K., Lord, P., Tabernero, L., Brass, A. and
       86.                                                                Stevens, R. Protein classification using ontology
[16]   Ding, L., Finin, T., Joshi, A., Pan, R., Cost, R.S., Peng,         classification. Bioinformatics, 22 (14). e530-538.
       Y., Reddivari, P., Doshi, V. and Sachs, J. Swoogle: a search