=Paper= {{Paper |id=Vol-2137/paper_21.pdf |storemode=property |title=Facets, Tiers and Gems: Ontology Patterns for the Hypernormalisation |pdfUrl=https://ceur-ws.org/Vol-2137/paper_21.pdf |volume=Vol-2137 |authors=Phillip Lord,Robert Stevens |dblpUrl=https://dblp.org/rec/conf/icbo/LordS17 }} ==Facets, Tiers and Gems: Ontology Patterns for the Hypernormalisation== https://ceur-ws.org/Vol-2137/paper_21.pdf
Facets, Tiers and Gems: Ontology Patterns for Hypernormalisation

Phillip Lord (1,*) and Robert Stevens (2)
(1) School of Computing Science, Newcastle University, Newcastle-upon-Tyne
(2) School of Computer Science, University of Manchester, Manchester

* To whom correspondence should be addressed: phillip.lord@newcastle.ac.uk

ABSTRACT
There are many methodologies and techniques for easing the task of ontology building. Here
we describe the intersection of two of these: ontology normalisation and fully programmatic
ontology development. The first of these describes a standardized organisation for an
ontology, with singly inherited self-standing entities, and a number of small taxonomies of
refining entities. The former are described and defined in terms of the latter and used to
manage the polyhierarchy of the self-standing entities. Fully programmatic development is a
technique where an ontology is developed using a domain-specific language within a
programming language, meaning that as well as defining ontological entities, it is possible
to add arbitrary patterns or new syntax within the same environment. We describe how new
patterns can be used to enable a new style of ontology development that we call
hypernormalisation.

1 INTRODUCTION
Building ontologies is a difficult and time-consuming business for a number of reasons: from
an abstract point of view, knowledge about the domain can be difficult to gather, to
understand and to represent ontologically; more immediately, ontologies, especially those
with a complex representation, can be taxing to describe and define consistently, and to
update, expand or change when that representation needs to change.

There have been numerous attempts to simplify and clarify this process, including: the
development of methodologies such as OntoClean, which defines a set of meta-properties that
can inform ontological modelling (Guarino and Welty, 2002); and upper ontologies such as
DOLCE or BFO (Grenon et al., 2004) that provide a pre-made upper classification.

Another approach that can leverage both of these techniques is ontology normalisation
(Rector, 2002). Originally intended as a mechanism for “untangling” existing hierarchies or
classifications being reused as the basis for an ontology, it also has significant use as a
pattern for building ontologies de novo.

Broadly, a normalised ontology is defined using a skeleton that is a strict tree (i.e. not a
general acyclic graph) of concepts differentiated using an inheritance (i.e. not a
partonomy) relationship. These are further split into: a set of self-standing entities in
which children are disjoint from each other, but do not cover the parent; and partitioning
or refining concepts that form closed, covering and disjoint hierarchies. Building an
ontology in this way allows the ontology developer to exploit the reasoner to build a
polyhierarchy by using classes that define the self-standing entity in terms of the refining
partitions. Polyhierarchies are difficult to build manually, as human ontology developers,
no matter how good their domain knowledge, find it hard to ensure all possible parents of an
entity are taken into account. The normalisation approach uses defined classes and reasoning
to remove this chore. Creating the tree of self-standing entities still, however, remains as
a task for the developer. The normalisation approach can significantly increase the
robustness and reduce the work of manual maintenance (Wroe et al., 2003). In this latter
form, ontological normalisation has been widely, if implicitly, used.

    Top Level
        Self-Standing Entities
            Body Substance
                Protein
                Steroid
                Organic Ion
            Person
            Role
                Care Role
                    Doctor Role
                    Nurse Role
                Patient Role
        Refining Type
            Value Types
                Age Type
                    Adult
                    Child
                Sex
                    Male
                    Female

Fig. 1. A normalised ontology slightly modified from Rector (2002). The graph does not
necessarily reflect subsumption, see text for details.

While the term “ontology normalisation” has been borrowed, somewhat metaphorically, from
database engineering, the process of building ontologies using a set of standard design
patterns has a rather more direct relationship to the software engineering equivalent. By
reusing a standard set of patterns, it is possible to build an ontology both rapidly and
consistently. This has manifested itself in a number of different ways, with a number of
different tools, such as TermGenie (Dietze et al., 2014) or Populous (Jupp et al., 2011),
which can generate ontologies according to a pattern.

We have previously described a fully programmatic methodology for ontology development
(Lord, 2013), using the Tawny-OWL environment. This is built around the programming language
Clojure and enables the ontology to take advantage of all the features of a programming
language and its environment, including unit testing (Warrender and Lord, 2015), build,
evaluation and, of course, pattern-driven development by simple use of functions (Warrender
and Lord, 2013). With respect to patterns, this environment has several advantages. First,
and unlike tools such as Populous and OPPL (Egana Aranguren et al., 2009), patterns are
developed in the same environment and syntax as simple ontology concepts; it is, therefore,
as easy to define a pattern as it is to define a class. Second, being based on Clojure, a
language which is homoiconic and has very little syntax of its own, it is possible to build
arbitrary syntactic constructions to represent patterns in a way that is both convenient and
attractive to the developer.

In this paper, we describe an extension of the normalisation technique that we call
hypernormalisation. This technique is typified by the (near or complete) absence of asserted
hierarchy among the self-standing entities. We describe how this allows construction of an
exemplar ontology of amino-acids (Stevens and Lord, 2012). We then move on to describe
recent developments
in the Tawny-OWL environment, including the definition of two new design patterns, the tier
and the facet, and one syntactic abstraction, the gem, which can be used to enable
hypernormalised ontology development. Finally, we discuss the application of this approach
to other ontologies.

2 HYPERNORMALISATION AND AMINO ACIDS
Normalisation is a methodology that aims to disentangle an ontological structure, in the
process managing the maintainability, utility and expressivity of the ontology generated. To
achieve this, the ontology is split into two main hierarchies: self-standing entities and
refining types; see Figure 1 for an example. The self-standing hierarchy contains entities
with a central hierarchy or skeleton. In this part of the ontology, we would expect that the
hierarchy contains levels that are not exhaustive: that is, the children do not cover the
parents, and parents are not closed to new children. This is contrasted by the refining
hierarchy, which consists of classes that are exhaustive; in many cases, children will be
non-overlapping and, therefore, disjoint. This is not to say that the refining type
hierarchies are necessarily complete: in Figure 1, for example, the representation of Sex is
too simple for many medical uses, but might be sufficient for a customer relations system.
In general, the self-standing entities will be defined in terms of the refining types, while
polyhierarchical relationships between the self-standing entities will be determined through
use of a reasoner.

This form of ontology development is quite different from an upper ontology and agnostic to
the choice of upper ontology or none. Rector (2002) suggests only that self-standing
entities and refining types should be “made clear by some mechanism”; in OWL, this could be
an upper ontological term, or an annotation.

We next introduce the amino-acid ontology, used here as an exemplar, which defines the
biological amino-acids in terms of the physiochemical properties most relevant to their
biological role. It is a structurally interesting ontology because it is normalised, with a
clear and clean separation between the self-standing entities and the five refining
concepts. It is rather more than this, though; the self-standing entities are split into
only three sets: the amino-acids themselves (e.g. Alanine); a (very large) set of defined
classes describing the refined types of amino-acid (e.g. Small Neutral Amino Acid); and,
finally, the single class Amino Acid. Or, stated alternatively, it contains no skeleton
hierarchy at all, and all relationships between the self-standing classes are arrived at
through reasoning. This is particularly relevant for the amino acid ontology as it contains
over 500 defined classes, with subsumption relationships to the amino acids and between
themselves. Maintaining this form of ontology by hand would be impractical.

We call this style of ontology development hypernormalised. We believe that it is a natural
extension of normalisation. Rector notes, for example, that the choice of aspect to form the
skeleton is “to some degree arbitrary”, but that they should be rigid (after OntoClean
(Guarino and Welty, 2002)) and pragmatically stable (i.e. unlikely to change during the
evolution of the ontology). Both of these are, however, true for all the refining concepts
in the amino-acids. In short, not only is the choice of skeleton arbitrary, it is actually
unnecessary and brings no further utility to the ontology than that which can be achieved by
use of reasoning.

We note that the distinction between normalisation and hypernormalisation is not absolute,
but one of degree; we are simply describing the tendency toward an ontology with a flat
asserted hierarchy.

Having introduced the notion of a hypernormalised ontology, we next consider a set of new
patterns in Tawny-OWL that enable this style of ontology development.

3 PATTERNISING AND TAWNY-OWL
The Tawny-OWL environment (Lord, 2013) and its ability to support patterns (Warrender and
Lord, 2013) have been described elsewhere in detail; here, we provide a quick overview, so
that the rest of the paper is clear. Tawny-OWL is implemented as a DSL (domain-specific
language) in Clojure, which is a Lisp-like language implemented in Java, and running on the
Java virtual machine.

    Top Level
        Self-Standing Entities
            Defined Class
                Small AA
                S. Neutral AA
                S. N. Aliphatic AA
                500 more
            Amino Acid
            Alanine
            Arginine
            18 more
        Refining Type
            Size
                Tiny
                Small
                Large
            Charge
            Hydro.
            Polarity
            SideChain

Fig. 2. A hypernormalised ontology representing the amino-acids using the same terminology
as Figure 1. Some labels have been abbreviated.
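
As an illustration of the structure in Figure 2, the following is a minimal sketch in plain
Clojure of how a flat set of self-standing classes can acquire its hierarchy purely from
descriptions in terms of refining values. It deliberately does not use Tawny-OWL or an OWL
reasoner: subsumption is approximated by set inclusion over property/value pairs, which is an
assumption of this sketch and far weaker than description logic reasoning. The values for
Alanine follow the gem definition given in Section 7; the defined class names other than
Small Neutral Amino Acid, and the helper names subsumes? and inferred-parents, are
hypothetical and introduced only for illustration.

```clojure
(ns hypernorm.sketch
  (:require [clojure.set :as set]))

;; Self-standing classes, described only by their refining values.
;; Alanine's values follow the defgem example in Section 7 of the paper.
(def self-standing
  {:Alanine #{[:hasCharge :Neutral]
              [:hasHydrophobicity :Hydrophobic]
              [:hasPolarity :NonPolar]
              [:hasSideChainStructure :Aliphatic]
              [:hasSize :Tiny]}})

;; Hypothetical defined classes: each is a conjunction of property/value pairs.
(def defined
  {:TinyAminoAcid         #{[:hasSize :Tiny]}
   :NeutralAminoAcid      #{[:hasCharge :Neutral]}
   :TinyNeutralAminoAcid  #{[:hasSize :Tiny] [:hasCharge :Neutral]}
   :SmallNeutralAminoAcid #{[:hasSize :Small] [:hasCharge :Neutral]}})

(defn subsumes?
  "True when every restriction required by `general` also holds for `specific`.
  A set-inclusion stand-in for reasoner-computed subsumption."
  [general specific]
  (set/subset? general specific))

(defn inferred-parents
  "Names of the defined classes whose restrictions are all met by `restrictions`."
  [restrictions]
  (into #{}
        (for [[cname required] defined
              :when (subsumes? required restrictions)]
          cname)))

;; No parent is asserted for Alanine; its parents are computed.
(inferred-parents (self-standing :Alanine))
;; => #{:TinyAminoAcid :NeutralAminoAcid :TinyNeutralAminoAcid}
```

The same subsumes? check also orders the defined classes among themselves, for example
placing :TinyNeutralAminoAcid below :TinyAminoAcid, which mirrors the way the defined classes
of the amino-acid ontology have subsumption relationships both to the amino acids and between
themselves.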


Tawny-OWL itself wraps the OWL API (Horridge and Bechhofer, 2011); this is the same library
that underpins Protégé, and from it, Tawny-OWL gains much of its functionality. Simple
sections of the ontology can be generated using a syntax based on a “lispified” version of
Manchester OWL Notation; for example, the following code:

    (defclass A
      :super B)

This declares a new class A that has the pre-existing class B as a superclass [1], which in
Manchester OWL notation would be expressed as:

    Class: o:A
        SubClassOf:
            o:B

[1] See Lord (2014) for an explanation of why :super is used rather than :subclass.

The defclass form is entirely valid Clojure and can be evaluated in any Clojure environment,
such as CIDER/Emacs or Cursive/IntelliJ. It is also possible to define new patterns: for
example the following pattern definition:

    (defn some-only [property & clazzes]
      (list (some property clazzes)
            (only property (or clazzes))))

defines the some-only pattern, which generates a set of existential restrictions and one
universal restriction with the union of the existential fillers as its filler; this
implements the ontological closure pattern. This is a function definition in Clojure terms:
defn introduces the function, property & clazzes is the argument list, some, only and or are
functions provided by Tawny-OWL and list returns, prosaically, a list [2]. Critically, it is
possible to define this pattern in the same environment, or side-by-side in the same file as
a simple class definition; with Tawny-OWL it is as easy to define and use a new pattern as
it is to define a class. Ontologies such as the Karyotype ontology make extensive use of
this facility, moving freely between ontology and pattern definitions, as well as literal
data structures, utility functions and unit tests (Warrender and Lord, 2013).

[2] The function shown here is a slightly simplified version of one provided in Tawny-OWL.

Tawny-OWL is now a mature and used software product; the first alpha release of Tawny-OWL
was in Nov 2012, the first full release in Nov 2013, followed by four point releases to
2016. This paper describes mostly the upcoming v2.0 release, although some of the features
described were available in earlier versions.

4 THE VALUE PARTITION
A common pattern for building a normalised ontology is called the value partition. This
pattern (Rector, 2005) addresses the problem of the ontological modelling of a continuous
range. For example, in modelling the amino-acids, we can consider the concept of Size; this
could be described directly using the molecular weight of the amino-acid. However, for the
purpose of the amino-acids, it is both easy and general practice to split size into three
categories: tiny, small and large. In Tawny-OWL, this can be achieved straightforwardly
using the defpartition function [3].

    (defpartition Size
      [Tiny Small Large]
      :domain AminoAcid
      :super PhysioChemicalProperty)

[3] For those with knowledge of Lisp, this is actually a macro; the main implementation is
in the value-partition function. Tawny-OWL provides support for implementing syntactic
macros whose function is simply to allow the use of bare symbols. For those without
knowledge of Lisp, the distinction is not important!

Axiomatically, this expands into: a class Size; three subclasses, Tiny, Small and Large;
and a property hasSize. The
property is functional, has range of Size and domain of AminoAcid. Expanded, this would be
expressed as follows [4]:

    Class: o:Large
        SubClassOf:
            o:Size

    Class: o:Size
        EquivalentTo:
            o:Large or o:Small or o:Tiny

        SubClassOf:
            o:PhysioChemicalProperty

    Class: o:Small
        SubClassOf:
            o:Size

    Class: o:Tiny
        SubClassOf:
            o:Size

    DisjointClasses:
        o:Large,o:Small,o:Tiny

[4] Tawny-OWL also adds annotations which have been elided.

The subclasses are disjoint and cover the parent. Following the terminology from Rector
(2002), the value partition is useful for defining partitioning or refining concepts.

5 THE TIER
The value partition is a pattern aimed at a specific purpose: segmenting a continuous range.
In practice, though, we have found that the axiomatization of this pattern is more generally
useful. For example, considering the amino-acid ontology, it is natural to model the
chemistry of the side-chain as such:

    (defpartition SideChainStructure
      [Aromatic Aliphatic]
      :domain AminoAcid
      :super PhysioChemicalProperty)

While this is intuitive, ontologically, SideChainStructure is actually of a very different
form from Size, as it does not reflect a spectrum. Either the side-chain contains a benzene
ring, making it aromatic, or it does not. This form of partition was also noted in Rector
(2002), which includes the classes Male and Female; these do not form a spectrum, at least
in this simplified representation. We introduce here, therefore, the more general notion of
the tier: a small set of concepts in a one-deep hierarchy. The tier function supports a
range of options:

    (deftier Charge
      [Positive Neutral Negative]
      :domain AminoAcid
      :super PhysioChemicalProperty
      :suffix true)

The use of :suffix true causes a simple change to the naming of the entities: Positive will
become PositiveCharge, which would be expanded as follows:

    Class: o:PositiveCharge
        SubClassOf:
            o:Charge

Other names are modified equivalently. By default, this will manifest when referring to the
class in the Tawny-OWL environment, in the IRI of the concept when serialized as OWL, and in
the value of an annotation on the concepts [5]. In addition to naming, it is also possible
to choose whether or not the subclasses are disjoint, whether they are covering, whether the
property is functional, and whether it is created at all.

[5] The duplication between the annotation and the IRI fragment is there because of IRI
schemes such as numeric style OBO IDs; annotations have been elided for brevity.

The tier is a more general pattern than the value partition; in fact, in the current version
of Tawny-OWL, the latter is defined in terms of the former.

6 THE FACET
Both the value partition and tier introduce a new object property named after the tier, and
with a range limited to the classes defined within the tier. The converse is also true;
where we use one of the tier classes, such as PositiveCharge, it is most likely that we wish
to use it with the hasCharge property defined as part of the tier. Taken together, we
describe the combination of classes and a property as a facet. Facets are a well known
technique, first proposed in a library classification (the Colon Classification
(Ranganathan, 1933), named after the use of “:” as a separator). They are now commonplace,
as seen with facetted browsers used by many websites for navigation of complex product
catalogues.

Tawny-OWL provides explicit support for facets, allowing the association of a property and a
set of classes, as demonstrated by the following code:

    (as-facet
     hasCharge
     Positive Neutral Negative)

The practical implication of this is that we can now use the facet function to return an
existential restriction given just a class. We can express this programmatically; for
example, we might use Clojure's assert function.

    (assert
     (= (some hasCharge Positive)
        (facet Positive)))

By itself, this ability is only slightly more succinct. However, when used with multiple
facetted classes, the advantages become considerably clearer, as can be shown by the
following assertion.

    (assert
     (= (list (some hasCharge
                    Neutral)
              (some hasHydrophobicity
                    Hydrophobic)
              (some hasPolarity
                    NonPolar)
              (some hasSideChainStructure
                    Aliphatic)
              (some hasSize
                    Tiny))
        (facet Neutral Hydrophobic NonPolar
               Aliphatic Tiny)))

In addition to succinctness, this pattern also reduces the risk of errors; a class such as
Tiny will always be used with its correct property. Without the use of facets, the ontology
developer must achieve this by hand. It would also be possible to detect the error using
reasoning, although this will only succeed if appropriate range and disjoint restrictions
are in the ontology. The defpartition and deftier functions, of course, both add these range
and disjoint restrictions and declare their classes as facets of their properties.

7 THE GEM
Finally, we define the gem, which provides a syntactic abstraction for a class composed
entirely or mainly from facets. Following the terminology from Rector (2002), this
abstraction would be useful mostly for self-standing concepts. For example, we could define
the amino acid alanine using the following defgem statement.

    (defgem Alanine
      :comment "An amino acid with a single methyl group as a side-chain."
      :facet Neutral Hydrophobic NonPolar
      Aliphatic Tiny)

The other amino-acids can be likewise defined as a series of gems. In fact, the amino acids
are so regular, all having the same five facets, that we use a further syntactic abstraction
specific to the amino-acid ontology, a form of pattern that we describe as localized
(Warrender, 2015). The gem represents generalised syntax useful for developing any ontology.

8 ON ANNOTATION
We have previously discussed the relationship between a design [...]

[...] explicit in the OWL serialization. Tawny-OWL actually uses these annotations
internally, for example, to enable the facet functionality by providing a relationship
between the classes and the appropriate object property. This is strictly an implementation
detail and could have been achieved without annotations; however, we believe that it shows
the value of having this knowledge explicit in OWL.

9 DISCUSSION
In this paper, we describe how we have used Tawny-OWL to provide higher-level patterns which
can be applied to ontology development. The patterns provide both functionality and
syntactic abstraction over the underlying OWL implementation. In the process, they enable
the easy and accurate construction of ontologies.

More specifically, we demonstrate two new patterns: the tier and the facet. The tier is an
extension of the existing value partition pattern and can be used for the generation of many
small hierarchies that can be used as refining properties. The facet borrows from the
library sciences notion of a facetted classification, and is used to associate a set of
classes with a specific set of values. This form of classification is very common on the
web; the majority of web stores, for example, offer facetted browsing, often with the facets
changing for different subsections of the catalogue.

Taken together, these two patterns enable a new form of ontology development,
hypernormalisation, which is an extreme form of normalisation. In this form of
normalisation, we do away with the creation of a tree of self-standing entities and instead
rely on the reasoner to build all the hierarchy. As well as making the ontologist’s task
easier, it makes the characteristic that would have been used to create the tree of
self-standing entities explicit in the form of a refining characteristic. Here, we have
described the application of this methodology to the exemplar amino-acid ontology. Of
course, it is dangerous to extrapolate to generality from an exemplar, but we have also
started to apply hypernormalisation to ontologies of other, more real, domains including
clouds (in the meteorological sense), cell lines and a reworking of the Gene Ontology. The
tier has been made generic; it does not require, for example, that all refining types are
closed (i.e. all possibilities are known in advance) nor disjoint.

Clearly, not all forms of ontology will naturally be represented in a hypernormalised form.
For example, the Karyotype ontology (Warrender and Lord, 2013) is far from this form; here,
we define the self-standing concepts and then use reasoning over a set of defined classes
which effectively operate as facets (Warrender and
methodology such as normalisation and the use of an upper               Lord, 2015). However, the popularity of the facetted browsers shows
ontology. The Tawny-OWL patterns described here are all                 that is possible to use this form of classification in many areas. We
orthogonal and agnostic to the choice of an upper ontology or to        believe that the introduction of the concept of hypernormalisation
none. They do not place their entities in any particular part of the    and the implementation of it in Tawny-OWL could have significant
class hierarchy nor define classes outside of those required for the    implications for the future development of ontologies.
domain ontology, although they could be easily extended to do so
should the ontology developer require.                                  REFERENCES
   However, we agree with Rector (2002) that the use of patterns        Dietze, H., Berardini, T. Z., Foulger, R. E., Hill, D. P., Lomax, J., OsumiSutherland,
should “made clear” and be explicit within the ontology. For               D., Roncaglia, P., and Mungall, C. J. (2014). Termgenie - a web application for
this reason, all of the patterns described here also make use of           pattern-based ontology class generation. Journal of Biomedical Semantics, 5(1), 48.
annotations, using annotation properties defined using its own          Egana Aranguren, M., Stevens, R., and Antezana, E. (2009). Transforming
                                                                           the axiomisation of ontologies: The ontology pre-processor language. Nature
internal annotation ontology. For example, all entities generated          Precedings.
as a result of a pattern such as deftier are explicitly annotated       Grenon, P., Smith, B., and Goldberg, L. (2004). Biodynamic ontology: applying BFO
as such. This means that the use of these patterns is (informally)         in the biomedical domain. Stud Health Technol Inform, 102, 20–38.






Guarino, N. and Welty, C. (2002). Evaluating ontological decisions with ontoclean.
   Commun. ACM, 45(2), 61–65.
Horridge, M. and Bechhofer, S. (2011). The OWL API: A Java API for OWL
   Ontologies. Semantic Web Journal, 2.
Jupp, S., Horridge, M., Iannone, L., Klein, J., Owen, S., Schanstra, J., Wolstencroft, K.,
   and Stevens, R. (2011). Populous: a tool for building owl ontologies from templates.
   BMC Bioinformatics, 13(Suppl 1), S5.
Lord, P. (2013). The Semantic Web takes Wing: Programming Ontologies with Tawny-OWL.
   OWLED 2013.
Lord, P. (2014). Manchester syntax is a bit backward.
   http://www.russet.org.uk/blog/2985.
Ranganathan, S. (1933). Colon Classification.
Rector, A. (2005). Representing specified values in owl: “value partitions” and “value
   sets”. W3C Working Group Note.
Rector, A. L. (2002). Normalisation of ontology implementations: Towards modularity,
   re-use, and maintainability. Proceedings Workshop on Ontologies for Multiagent
   Systems (OMAS) in conjunction with European Knowledge Acquisition Workshops.
   Siguenza, Spain.
Stevens, R. and Lord, P. (2012). Semantic publishing of knowledge about amino acids.
   http://ceur-ws.org/Vol-903/paper-06.pdf.
Warrender, J. (2015). The Consistent Representation of Scientific Knowledge:
   Investigations into the Ontology of Karyotypes and Mitochondria. Ph.D. thesis,
   School of Computing Science, Newcastle University.
Warrender, J. and Lord, P. (2013). A pattern-driven approach to biomedical ontology
   engineering. SWAT4LS 2013.
Warrender, J. D. and Lord, P. (2013). The Karyotype Ontology: a computational
   representation for human cytogenetic patterns. Bio-Ontologies 2013.
Warrender, J. D. and Lord, P. (2015). How, What and Why to test an ontology.
Wroe, C., Stevens, R., Goble, C., and Ashburner, M. (2003). A methodology to
   migrate the gene ontology to a description logic environment using daml+oil. Pacific
   Symposium on Biocomputing.



