=Paper=
{{Paper
|id=Vol-2137/paper_21.pdf
|storemode=property
|title=Facets, Tiers and Gems: Ontology Patterns for the Hypernormalisation
|pdfUrl=https://ceur-ws.org/Vol-2137/paper_21.pdf
|volume=Vol-2137
|authors=Phillip Lord,Robert Stevens
|dblpUrl=https://dblp.org/rec/conf/icbo/LordS17
}}
==Facets, Tiers and Gems: Ontology Patterns for the Hypernormalisation==
Facets, Tiers and Gems: Ontology Patterns for Hypernormalisation

Phillip Lord 1∗ and Robert Stevens 2

1 School of Computing Science, Newcastle University, Newcastle-upon-Tyne

2 School of Computer Science, University of Manchester, Manchester
∗ To whom correspondence should be addressed: phillip.lord@newcastle.ac.uk

ABSTRACT

There are many methodologies and techniques for easing the task of ontology building. Here we describe the intersection of two of these: ontology normalisation and fully programmatic ontology development. The first of these describes a standardized organisation for an ontology, with singly inherited self-standing entities, and a number of small taxonomies of refining entities. The former are described and defined in terms of the latter and used to manage the polyhierarchy of the self-standing entities. Fully programmatic development is a technique where an ontology is developed using a domain-specific language within a programming language, meaning that as well as defining ontological entities, it is possible to add arbitrary patterns or new syntax within the same environment. We describe how new patterns can be used to enable a new style of ontology development that we call hypernormalisation.

1 INTRODUCTION

Building ontologies is a difficult and time-consuming business for a number of reasons: from an abstract point of view, knowledge about the domain can be difficult to gather, to understand and to represent ontologically; more immediately, ontologies, especially those with a complex representation, can be taxing to describe and define consistently, and to update, expand or change when that representation needs to change.

There have been numerous attempts to simplify and clarify this process, including: the development of methodologies such as OntoClean, which defines a set of meta-properties that can inform ontological modelling (Guarino and Welty, 2002); and upper ontologies such as DOLCE or BFO (Grenon et al., 2004) that provide a pre-made upper classification.

Another approach that can leverage both of these techniques is ontology normalisation (Rector, 2002). Originally intended as a mechanism for “untangling” existing hierarchies or classifications being reused as the basis for an ontology, it also has significant use as a pattern for building ontologies de novo.

Broadly, a normalised ontology is defined using a skeleton that is a strict tree (i.e. not an acyclic graph) of concepts differentiated using an inheritance (i.e. not a partonomy) relationship. These are further split into: a set of self-standing entities in which children are disjoint from each other, but do not cover the parent, and partitioning or refining concepts that form closed, covering and disjoint hierarchies. Building an ontology in this way allows the ontology developer to exploit the reasoner to build a polyhierarchy by using classes that define the self-standing entity in terms of the refining partitions. Polyhierarchies are difficult to build manually, as human ontology developers, no matter how good their domain knowledge, find it hard to ensure all possible parents of an entity are taken into account. The normalisation approach uses defined classes and reasoning to remove this chore. Creating the tree of self-standing entities still, however, remains as a task for the developer. The normalisation approach can significantly increase the robustness and reduce the work of manual maintenance (Wroe et al., 2003). In this latter form, ontological normalisation has been widely, if implicitly, used.

While the term “ontology normalisation” has been borrowed, somewhat metaphorically, from database engineering, the process of building ontologies using a set of standard design patterns has a rather more direct relationship to the software engineering equivalent. By reusing a standard set of patterns, it is possible to build an ontology both rapidly and consistently. This has manifested itself in a number of different ways, with a number of different tools, such as TermGenie (Dietze et al., 2014) or Populous (Jupp et al., 2011), which can generate ontologies according to a pattern.

We have previously described a fully programmatic methodology for ontology development (Lord, 2013), using the Tawny-OWL environment. This is built around the programming language Clojure and enables the ontology to take advantage of all the features of a programming language and its environment, including unit testing (Warrender and Lord, 2015), build, evaluation and, of course, pattern-driven development by simple use of functions (Warrender and Lord, 2013). With respect to patterns, this environment has several advantages. First, and unlike tools such as Populous and OPPL (Egana Aranguren et al., 2009), patterns are developed in the same environment and syntax as simple ontology concepts; it is, therefore, as easy to define a pattern as it is to define a class. Second, being based on Clojure, a language which is homoiconic and has very little syntax of its own, it is possible to build arbitrary syntactic constructions to represent patterns in a way that is both convenient and attractive to the developer.

In this paper, we describe an extension of the normalisation technique that we call hypernormalisation. This technique is typified by the (near or complete) absence of asserted hierarchy among the self-standing entities. We describe how this allows construction of an exemplar ontology of amino-acids (Stevens and Lord, 2012). We then move on to describe recent developments
in the Tawny-OWL environment, including the definition of two new design patterns, the tier and the facet, and one syntactic abstraction, the gem, which can be used to enable hypernormalised ontology development. Finally, we discuss the application of this approach to other ontologies.

[Figure 1: tree diagram. Node labels as extracted, one level per line:]
 Top Level
 Self-Standing Entities; Refining Type
 Body Substance; Person Role; Value Types
 Protein Steroid Organic Ion; Care Role Patient Role; Age Type Sex
 Doctor Role Nurse Role; Adult Child; Male Female

Fig. 1. A normalised ontology slightly modified from Rector (2002). The graph does not necessarily reflect subsumption, see text for details.

2 HYPERNORMALISATION AND AMINO ACIDS

Normalisation is a methodology that aims to disentangle an ontological structure, in the process managing the maintainability, utility and expressivity of the ontology generated. To achieve this, the ontology is split into two main hierarchies: self-standing entities and refining types; see Figure 1 for an example. The self-standing hierarchy contains entities with a central hierarchy or skeleton. In this part of the ontology, we would expect that the hierarchy contains levels that are not exhaustive; that is, the children do not cover the parents, and parents are not closed to new children. This is contrasted by the refining hierarchy, which consists of classes that are exhaustive; in many cases, children will be non-overlapping and, therefore, disjoint. This is not to say that the refining types hierarchy is necessarily complete: in Figure 1, for example, the representation of Sex is too simple for many medical uses, but might be sufficient for a customer relations system. In general, the self-standing entities will be defined in terms of the refining types, while polyhierarchical relationships between the self-standing entities will be determined through use of a reasoner.

This form of ontology development is quite different from an upper ontology and agnostic to the choice of upper ontology or none. Rector (2002) suggests only that self-standing entities and refining types should be “made clear by some mechanism”; in OWL, it could be an upper ontological term, or an annotation.

We next introduce the amino-acid ontology, used here as an exemplar, which defines the biological amino-acids in terms of the physicochemical properties most relevant to their biological role. It is a structurally interesting ontology because it is normalised, with a clear and clean separation between the self-standing entities and the five refining concepts. It is rather more than this, though; the self-standing entities are split into only three sets: the amino-acids themselves (e.g. Alanine); a (very large) set of defined classes describing the refined types of amino-acid (e.g. Small Neutral Amino Acid); and, finally, the single class Amino Acid. Or, stated alternatively, it contains no skeleton hierarchy at all, and all relationships between the self-standing classes are arrived at through reasoning. This is particularly relevant for the amino acid ontology as it contains over 500 defined classes, with subsumption relationships to the amino acids and between themselves. Maintaining this form of ontology by hand would be impractical.

We call this style of ontology development hypernormalised. We believe that it is a natural extension of normalisation. Rector notes, for example, that the choice of aspect to form the skeleton is “to some degree arbitrary”, but that they should be rigid (after OntoClean (Guarino and Welty, 2002)) and pragmatically stable (i.e. unlikely to change during the evolution of the ontology). Both of these are, however, true for all the refining concepts in the amino-acids. In short, not only is the choice of skeleton arbitrary, it is actually unnecessary and brings no further utility to the ontology than that which can be achieved by use of reasoning.

We note that the distinction between normalisation and hypernormalisation is not absolute, but one of degree; we are simply describing the tendency toward an ontology with a flat asserted hierarchy.

Having introduced the notion of a hypernormalised ontology, we next consider a set of new patterns in Tawny-OWL that enable this style of ontology development.

3 PATTERNISING AND TAWNY-OWL

The Tawny-OWL environment (Lord, 2013) and its ability to support patterns (Warrender and Lord, 2013) have been described elsewhere in detail; here, we provide a quick overview, so that the rest of the paper is clear. Tawny-OWL is implemented as a DSL (domain-specific language) in Clojure, which is a Lisp-like language implemented in Java, and running on the Java virtual machine.
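To illustrate what it means for an ontology language to be embedded in a general-purpose programming language, the following is a minimal sketch written in Python rather than Clojure. It is not the Tawny-OWL API: `OwlClass`, `defclass`, `defsubclasses` and `to_manchester` are hypothetical names chosen for this illustration, and the rendered text only approximates Manchester OWL notation. The point it shows is that, once class declarations are ordinary function calls, a pattern is simply another function.

```python
from dataclasses import dataclass, field


@dataclass
class OwlClass:
    """A named class with zero or more asserted superclasses (illustrative only)."""
    name: str
    supers: list = field(default_factory=list)


def defclass(ontology, name, supers=()):
    """Declare a class in `ontology` (a plain dict keyed by name) and return it."""
    cls = OwlClass(name, list(supers))
    ontology[name] = cls
    return cls


def defsubclasses(ontology, parent, names):
    """A 'pattern': declare several classes that share one parent."""
    return [defclass(ontology, n, supers=[parent]) for n in names]


def to_manchester(cls, prefix="o"):
    """Render one class in a Manchester-like notation."""
    lines = [f"Class: {prefix}:{cls.name}"]
    if cls.supers:
        lines.append("    SubClassOf:")
        lines.append(",\n".join(f"        {prefix}:{s.name}" for s in cls.supers))
    return "\n".join(lines)


onto = {}
b = defclass(onto, "B")
a = defclass(onto, "A", supers=[b])
print(to_manchester(a))
```

Because `defsubclasses` lives in the same file and language as `defclass`, defining and using a pattern needs no separate tool or template format.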
[Figure 2: tree diagram. Node labels as extracted, one level per line:]
 Top Level
 Self-Standing Entities; Refining Type
 Defined Class, Amino Acid, Alanine, Arginine, 18 more; Size, Charge, Hydro., Polarity, SideChain
 Small AA, S. Neutral AA, S. N. Aliphatic AA, 500 more; Tiny, Small, Large

Fig. 2. A hypernormalised ontology representing the amino-acids using the same terminology as Figure 1. Some labels have been abbreviated.

Tawny-OWL itself wraps the OWL API (Horridge and Bechhofer, 2011); this is the same library that underpins Protégé, and from it, Tawny-OWL gains much of its functionality. Simple sections of the ontology can be generated using a syntax based on a “lispified” version of Manchester OWL Notation; for example, the following code:

 (defclass A
   :super B)

This declares a new class A that has the pre-existing class B as a superclass [1], which in Manchester OWL notation would be expressed as:

 Class: o:A
     SubClassOf:
         o:B

The defclass form above is entirely valid Clojure and can be evaluated in any Clojure environment, such as CIDER/Emacs or Cursive/IntelliJ. It is also possible to define new patterns: for example, the following pattern definition:

 ;; one existential restriction per class, plus one universal
 ;; restriction whose filler is the union of those classes
 (defn some-only [property & clazzes]
   (list (some property clazzes)
         (only property (or clazzes))))

defines the some-only pattern, which generates a set of existential restrictions and one universal restriction with the union of the existential fillers as its filler; this implements the ontological closure pattern. This is a function definition in Clojure terms: defn introduces the function, property & clazzes is the argument list, some, only and or are functions provided by Tawny-OWL, and list returns, prosaically, a list [2]. Critically, it is possible to define this pattern in the same environment, or side-by-side in the same file, as a simple class definition; with Tawny-OWL it is as easy to define and use a new pattern as it is to define a class. Ontologies such as the Karyotype ontology make extensive use of this facility, moving freely between ontology and pattern definitions, as well as literal data structures, utility functions and unit tests (Warrender and Lord, 2013).

[1] See Lord (2014) for an explanation of why :super is used rather than :subclass.

[2] The function shown here is a slightly simplified version of one provided in Tawny-OWL.

Tawny-OWL is now a mature and used software product; the first alpha release of Tawny-OWL was in November 2012 and the first full release in November 2013, followed by four point releases to 2016. This paper mostly describes the upcoming v2.0 release, although some of the features described were available in earlier versions.

4 THE VALUE PARTITION

A common pattern for building a normalised ontology is called the value partition. This pattern (Rector, 2005) addresses the problem of the ontological modelling of a continuous range. For example, in modelling the amino-acids, we can consider the concept of Size; this could be described directly using the molecular weight of the amino-acid. However, for the purpose of the amino-acids, it is both easy and general practice to split size into three categories: tiny, small and large. In Tawny-OWL, this can be achieved straightforwardly using the defpartition function [3].

 (defpartition Size
   [Tiny Small Large]
   :domain AminoAcid
   :super PhysioChemicalProperty)

[3] For those with knowledge of Lisp, this is actually a macro; the main implementation is in the value-partition function. Tawny-OWL provides support for implementing syntactic macros whose function is simply to allow the use of bare symbols. For those without knowledge of Lisp, the distinction is not important!

Axiomatically, this expands into: a class Size; three subclasses, Tiny, Small and Large; and a property hasSize. The
property is functional, has a range of Size and a domain of AminoAcid. Expanded, this would be expressed as follows [4]:

 Class: o:Large
     SubClassOf:
         o:Size

 Class: o:Size
     EquivalentTo:
         o:Large or o:Small or o:Tiny
     SubClassOf:
         o:PhysioChemicalProperty

 Class: o:Small
     SubClassOf:
         o:Size

 Class: o:Tiny
     SubClassOf:
         o:Size

 DisjointClasses:
     o:Large,o:Small,o:Tiny

[4] Tawny-OWL also adds annotations, which have been elided.

The subclasses are disjoint and cover the parent. Following the terminology from Rector (2002), the value partition is useful for defining partitioning or refining concepts.

5 THE TIER

The value partition is a pattern aimed at a specific purpose: segmenting a continuous range. In practice, though, we have found that the axiomatization of this pattern is more generally useful. For example, considering the amino-acid ontology, it is natural to model the chemistry of the side-chain as such:

 (defpartition SideChainStructure
   [Aromatic Aliphatic]
   :domain AminoAcid
   :super PhysioChemicalProperty)

While this is intuitive, ontologically, SideChainStructure is actually of a very different form from Size, as it does not reflect a spectrum. Either the side-chain contains a benzene ring, making it aromatic, or it does not. This form of partition was also noted in Rector (2002), which includes the classes Male and Female; these do not form a spectrum, at least in this simplified representation. We introduce here, therefore, the more general notion of the tier: a small set of concepts in a one-deep hierarchy. The tier function supports a range of options:

 (deftier Charge
   [Positive Neutral Negative]
   :domain AminoAcid
   :super PhysioChemicalProperty
   :suffix true)

The use of :suffix true causes a simple change to the naming of the entities: Positive will become PositiveCharge, which would be expanded as follows:

 Class: o:PositiveCharge
     SubClassOf:
         o:Charge

Other names are modified equivalently. By default, this will manifest when referring to the class in the Tawny-OWL environment, in the IRI of the concept when serialized as OWL, and in the value of an annotation on the concepts [5]. In addition to naming, it is also possible to make optional whether the subclasses are disjoint or covering, whether the property is functional, and whether it is created at all.

[5] The duplication between the annotation and the IRI fragment is there because of IRI schemes such as numeric-style OBO IDs; annotations have been elided for brevity.

The tier is a more general pattern than the value partition; in fact, in the current version of Tawny-OWL, the latter is defined in terms of the former.

6 THE FACET

Both the value partition and tier introduce a new object property named after the tier, and with a range limited to the classes defined within the tier. The converse is also true; where we use one of the tier classes, such as PositiveCharge, it is most likely that we wish to use it with the hasCharge property defined as part of the tier. Taken together, we describe the combination of classes and a property as a facet. Facets are a well-known technique, first proposed in a library classification (the Colon Classification (Ranganathan, 1933), named after the use of “:” as a separator). They are now commonplace, as seen with the facetted browsers used by many websites for navigation of complex product catalogues.

Tawny-OWL provides explicit support for facets, allowing the association of a property and a set of classes, as demonstrated by the following code:

 (as-facet
  hasCharge
  Positive Neutral Negative)

The practical implication of this is that we can now use the facet function to return an existential restriction providing just a class. We can express this programmatically; for example, we might use the assert function provided by Clojure’s unit test framework.

 (assert
  (= (some hasCharge Positive)
     (facet Positive)))

By itself, this ability is only slightly more succinct. However, when used with multiple facetted classes, the advantages become considerably clearer, as can be shown by the following assertion.

 (assert
  (= (list (some hasCharge
                 Neutral)
           (some hasHydrophobicity
                 Hydrophobic)
           (some hasPolarity
                 NonPolar)
           (some hasSideChainStructure
                 Aliphatic)
           (some hasSize
                 Tiny))
     (facet Neutral Hydrophobic NonPolar
            Aliphatic Tiny)))

In addition to succinctness, this pattern also reduces the risk of errors; a class such as Tiny will always be used with its correct property. Without the use of facets, the ontology developer must achieve this by hand. It would also be possible to detect the error using reasoning, although this will only succeed if appropriate range and disjoint restrictions are in the ontology. The defpartition and deftier functions, of course, both add these range and disjoint restrictions and declare their classes as facets of their properties.

7 THE GEM

Finally, we define the gem, which provides a syntactic abstraction for a class composed entirely or mainly from facets. Following the terminology from Rector (2002), this abstraction would be useful mostly for self-standing concepts. For example, we could define the amino acid alanine using the following defgem statement.

 (defgem Alanine
   :comment "An amino acid with a single methyl group as a side-chain."
   :facet Neutral Hydrophobic NonPolar
          Aliphatic Tiny)

The other amino-acids can be likewise defined as a series of gems. In fact, the amino acids are so regular, all having the same five facets, that we use a further syntactic abstraction specific to the amino-acid ontology, a form of pattern that we describe as localized (Warrender, 2015). The gem represents generalised syntax useful for developing any ontology.

8 ON ANNOTATION

We have previously discussed the relationship between a design methodology such as normalisation and the use of an upper ontology. The Tawny-OWL patterns described here are all orthogonal and agnostic to the choice of an upper ontology or to none. They do not place their entities in any particular part of the class hierarchy nor define classes outside of those required for the domain ontology, although they could be easily extended to do so should the ontology developer require.

However, we agree with Rector (2002) that the use of patterns should be “made clear” and be explicit within the ontology. For this reason, all of the patterns described here also make use of annotations, using annotation properties defined in Tawny-OWL's own internal annotation ontology. For example, all entities generated as a result of a pattern such as deftier are explicitly annotated as such. This means that the use of these patterns is (informally) explicit in the OWL serialization. Tawny-OWL actually uses these annotations internally, for example, to enable the facet functionality by providing a relationship between the classes and the appropriate object property. This is strictly an implementation detail and could have been achieved without annotations; however, we believe that it shows the value of having this knowledge explicit in OWL.

9 DISCUSSION

In this paper, we describe how we have used Tawny-OWL to provide higher-level patterns which can be applied to ontology development. The patterns provide both functionality and syntactic abstraction over the underlying OWL implementation. In the process, they enable the easy and accurate construction of ontologies.

More specifically, we demonstrate two new patterns: the tier and the facet. The tier is an extension of the existing value partition pattern and can be used for the generation of many small hierarchies that can be used as refining properties. The facet borrows from the library-science notion of a facetted classification, and is used to associate a set of classes with a specific set of values. This form of classification is very common on the web; the majority of web stores, for example, offer facetted browsing, often with the facets changing for different subsections of the catalogue.

Taken together, these two patterns enable a new form of ontology development, hypernormalisation, which is an extreme form of normalisation. In this form of normalisation, we do away with the creation of a tree of self-standing entities and instead rely on the reasoner to build all the hierarchy. As well as making the ontologist’s task easier, it makes the characteristic that would have been used to create the tree of self-standing entities explicit in the form of a refining characteristic. Here, we have described the application of this methodology to the exemplar amino-acid ontology. Of course, it is dangerous to extrapolate to generality from an exemplar, but we have also started to apply hypernormalisation to ontologies of other, more real, domains including clouds (in the meteorological sense), cell lines and a reworking of the Gene Ontology. The tier has been made generic; it does not require, for example, that all refining types are closed (i.e. all possibilities are known in advance) or disjoint.

Clearly, not all forms of ontology will naturally be represented in a hypernormalised form. For example, the Karyotype ontology (Warrender and Lord, 2013) is far from this form; here, we define the self-standing concepts and then use reasoning over a set of defined classes which effectively operate as facets (Warrender and Lord, 2015). However, the popularity of facetted browsers shows that it is possible to use this form of classification in many areas. We believe that the introduction of the concept of hypernormalisation and the implementation of it in Tawny-OWL could have significant implications for the future development of ontologies.

REFERENCES

Dietze, H., Berardini, T. Z., Foulger, R. E., Hill, D. P., Lomax, J., Osumi-Sutherland, D., Roncaglia, P., and Mungall, C. J. (2014). Termgenie - a web application for pattern-based ontology class generation. Journal of Biomedical Semantics, 5(1), 48.

Egana Aranguren, M., Stevens, R., and Antezana, E. (2009). Transforming the axiomisation of ontologies: The ontology pre-processor language. Nature Precedings.

Grenon, P., Smith, B., and Goldberg, L. (2004). Biodynamic ontology: applying BFO in the biomedical domain. Stud Health Technol Inform, 102, 20–38.

Guarino, N. and Welty, C. (2002). Evaluating ontological decisions with ontoclean. Commun. ACM, 45(2), 61–65.

Horridge, M. and Bechhofer, S. (2011). The OWL API: A Java API for OWL Ontologies. Semantic Web Journal, 2.

Jupp, S., Horridge, M., Iannone, L., Klein, J., Owen, S., Schanstra, J., Wolstencroft, K., and Stevens, R. (2011). Populous: a tool for building owl ontologies from templates. BMC Bioinformatics, 13(Suppl 1), S5.

Lord, P. (2013). The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL. OWLED 2013.

Lord, P. (2014). Manchester syntax is a bit backward. http://www.russet.org.uk/blog/2985.

Ranganathan, S. (1933). Colon Classification.

Rector, A. (2005). Representing specified values in owl: “value partitions” and “value sets”. W3C Working Group Note.

Rector, A. L. (2002). Normalisation of ontology implementations: Towards modularity, re-use, and maintainability. Proceedings Workshop on Ontologies for Multiagent Systems (OMAS) in conjunction with European Knowledge Acquisition Workshops. Siguenza, Spain.

Stevens, R. and Lord, P. (2012). Semantic publishing of knowledge about amino acids. http://ceur-ws.org/Vol-903/paper-06.pdf.

Warrender, J. (2015). The Consistent Representation of Scientific Knowledge: Investigations into the Ontology of Karyotypes and Mitochondria. Ph.D. thesis, School of Computing Science, Newcastle University.

Warrender, J. and Lord, P. (2013). A pattern-driven approach to biomedical ontology engineering. SWAT4LS 2013.

Warrender, J. D. and Lord, P. (2013). The Karyotype Ontology: a computational representation for human cytogenetic patterns. Bio-Ontologies 2013.

Warrender, J. D. and Lord, P. (2015). How, What and Why to test an ontology.

Wroe, C., Stevens, R., Goble, C., and Ashburner, M. (2003). A methodology to migrate the gene ontology to a description logic environment using daml+oil. Pacific Symposium on Biocomputing.
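As a closing illustration of how the tier, facet and gem of Sections 5 to 7 fit together, the following is a minimal sketch in Python rather than Clojure. It does not use or reproduce the Tawny-OWL API: the function names merely echo deftier, as-facet, facet and defgem, the `registry` dictionary is a hypothetical stand-in for the annotations Tawny-OWL uses internally, and a simple subset test replaces a description-logic reasoner (so it only handles classes that are plain conjunctions of existential restrictions). The values Hydrophilic and Polar and the defined classes TinyAminoAcid and TinyNeutralAminoAcid are assumptions made for the example.

```python
def deftier(name, values, suffix=False):
    """Return (property name, class names) for a one-deep hierarchy."""
    prop = f"has{name}"
    classes = [f"{v}{name}" if suffix else v for v in values]
    return prop, classes


def as_facet(registry, prop, classes):
    """Record which property each class should be used with."""
    for cls in classes:
        registry[cls] = prop


def facet(registry, *classes):
    """Turn bare classes into (property, class) existential restrictions."""
    return [(registry[cls], cls) for cls in classes]


def defgem(registry, name, *classes):
    """A class described entirely by its facets."""
    return {"name": name, "restrictions": frozenset(facet(registry, *classes))}


def inferred_parents(gem, defined_classes):
    """Toy subsumption: a defined class is a parent if the gem carries all of its restrictions."""
    return sorted(d["name"] for d in defined_classes
                  if d["restrictions"] <= gem["restrictions"])


registry = {}
for tier_name, values in [
    ("Size", ["Tiny", "Small", "Large"]),
    ("Charge", ["Positive", "Neutral", "Negative"]),
    ("SideChainStructure", ["Aromatic", "Aliphatic"]),
    ("Hydrophobicity", ["Hydrophobic", "Hydrophilic"]),  # second value is assumed
    ("Polarity", ["Polar", "NonPolar"]),                 # first value is assumed
]:
    as_facet(registry, *deftier(tier_name, values))

alanine = defgem(registry, "Alanine",
                 "Neutral", "Hydrophobic", "NonPolar", "Aliphatic", "Tiny")
defined = [
    defgem(registry, "TinyAminoAcid", "Tiny"),                    # hypothetical
    defgem(registry, "TinyNeutralAminoAcid", "Tiny", "Neutral"),  # hypothetical
    defgem(registry, "SmallNeutralAminoAcid", "Small", "Neutral"),
]
print(inferred_parents(alanine, defined))
```

No parent is asserted for Alanine anywhere; its placement under the defined classes follows only from its facets, which is the sense in which the asserted hierarchy stays flat.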