<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reconciling ontology definitions using the Ontology Pattern Reconciliation Workbench and the DOSDP framework</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicolas Matentzoglu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Osumi-Sutherland</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>European Bioinformatics Institute (EMBL-EBI)</institution>
          ,
          <addr-line>Hinxton</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <fpage>7</fpage>
      <lpage>10</lpage>
      <abstract>
        <p>Many bio-ontologies use formal, logical definitions to automate multiple inheritance classification and drive cross-ontology inference. This requires the use of standardised design patterns: shared patterns of axiomatisation using common reference ontologies. Developing, managing and implementing a suitably consistent set of design patterns can be challenging. In many phenotype ontologies, for example, a large proportion of class terms have formal definitions following a general framework, known as entity/quality (EQ), with common relations and reference ontologies. Despite this, the formal definitions used are often too divergent to drive classification and cross-ontology inference. Here we present software tools for improving and managing formalisation using design patterns. The Ontology Pattern Reconciliation Workbench helps users prioritise patterns for reconciliation between two related ontologies based on the impact pattern reconciliation will have on cross-ontology mapping. An extension to the ontology starter kit provides a practical workflow for developing ontologies using formally specified design patterns.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>BACKGROUND</title>
      <p>
        Pattern-based ontology development is increasingly used to
automate and improve classification within ontologies
        <xref ref-type="bibr" rid="ref5">(Osumi-Sutherland et al., 2017; Rocca-Serra et al., 2011)</xref>
        . It also has
great potential for improving cross-ontology mappings, for example
between different species-specific phenotype ontologies. Some
degree of standardisation in phenotype ontologies has been achieved
through the widespread use of entity/quality-based (EQ) logical
definitions
        <xref ref-type="bibr" rid="ref2">(Mungall et al., 2010)</xref>
        in ontologies such as the Human Phenotype
Ontology (HP), Mammalian Phenotype Ontology (MP), Drosophila
Phenotype Ontology (DPO)
        <xref ref-type="bibr" rid="ref3">(Osumi-Sutherland et al., 2013)</xref>
        and Fission Yeast
Phenotype Ontology
        <xref ref-type="bibr" rid="ref1">(Harris et al., 2013)</xref>
        . EQ definitions reference
an entity, typically an independent continuant such as ‘finger’ or
‘cell’ or an occurrent such as ‘gastrulation’, and a quality that
inheres in that entity, such as ‘long’ or ‘increased pressure’.
      </p>
      <p>While EQ provides a basic framework for standardising
phenotype definitions, it is flexible enough that groups using it
independently tend to come up with divergent, often incompatible
definitions for similar phenotypes. HP, for example, defines
‘unilateral deafness’ as follows:</p>
      <p>Unilateral deafness (HP:0009900):
‘has part’ some (‘lacking processual parts’ and
(‘towards’ some ‘sensory perception of sound’) and
‘has modifier’ some ‘abnormal’ and ‘has modifier’
some ‘unilateral’)</p>
      <p>The phenotype is defined in terms of a PATO quality ‘lacking
processual parts’ and there is no associated entity that bears it. With
respect to the resulting classification, this is no problem, because
HP:‘Unilateral deafness’ is inferred to be a subclass of ‘abnormal
ear physiology’, which is defined to ‘inhere in’ some ear. (Note that,
unless ‘has part’ is made functional, it is not the ‘lacking processual
parts’ quality that ‘inheres in’ the ear.) The corresponding definition
in MP is as follows:</p>
      <p>unilateral deafness (MP:0004699):
‘has part’ some (‘unilateral’ and ‘inheres in’ some
‘deafness’ and ‘has modifier’ some ‘abnormal’)</p>
      <p>Here, the PATO ‘spatial pattern’ quality ‘unilateral’
is the central quality by which the phenotype is defined. This
quality inheres in the entity ‘deafness’, which is itself a phenotype
rather than an occurrent or independent continuant.</p>
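      <p>One way to make this divergence concrete is to compare the vocabulary the two definitions draw on. The following Python sketch is illustrative only and is not part of the workbench. It transcribes the two unilateral deafness definitions with ASCII quotes and assumes that labels never contain apostrophes, so that a simple regular expression can extract them.</p>
      <preformat>
```python
import re

# The two definitions from the text, transcribed with ASCII quotes.
HP_DEF = (
    "'has part' some ('lacking processual parts' and "
    "('towards' some 'sensory perception of sound') and "
    "'has modifier' some 'abnormal' and 'has modifier' some 'unilateral')"
)
MP_DEF = (
    "'has part' some ('unilateral' and 'inheres in' some 'deafness' "
    "and 'has modifier' some 'abnormal')"
)


def labels(expression):
    """Return the set of quoted labels used in a class expression.

    Assumption: labels are single-quoted and contain no apostrophes.
    """
    return set(re.findall(r"'([^']+)'", expression))


hp_labels = labels(HP_DEF)
mp_labels = labels(MP_DEF)

shared = sorted(hp_labels.intersection(mp_labels))
only_hp = sorted(hp_labels - mp_labels)
only_mp = sorted(mp_labels - hp_labels)

print("shared:", shared)
print("HP only:", only_hp)
print("MP only:", only_mp)
```
      </preformat>
      <p>Under these assumptions the definitions share only ‘has part’, ‘has modifier’, ‘abnormal’ and ‘unilateral’, while the terms that carry the actual EQ structure (‘lacking processual parts’ and ‘towards’ in HP, ‘inheres in’ and ‘deafness’ in MP) do not overlap.</p>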
      <p>Not all definitions diverge to such a high degree. An example of a
class that has only slightly different definitions, ‘hypertension’, can
be seen in Figure 1. In both definitions, ‘increased pressure’ is the
abnormal quality (modified by ‘abnormal’ and ‘chronic’) and the
relevant entity (the ‘bearer’ of the quality) is ‘blood’.</p>
      <p>However, whereas the MP definition uses separate clauses for the
anatomical entity ‘blood’ and for ‘part of’ some ‘arterial system’, the HP
definition only defines a single entity: ‘blood’, restricted to be
‘part of’ some ‘blood vessel’. These hypertension definitions may
not look significantly different, but they nevertheless result in
different inferred classifications. For example, MP:‘hypertension’
is classified as ‘increased systemic arterial blood pressure’, while
HP:‘hypertension’ is not. Both, however, are classified as ‘abnormal
systemic arterial blood pressure’.</p>
      <p>To reach an appropriate level of semantic integration, definitions
such as the ones presented here should be reconciled. In
the following, we introduce a prototype of the Ontology
Pattern Reconciliation Workbench, a tool suite that enables
the identification, collection and exploration of reconciliation
candidates across multiple ontologies. We also describe extensions
to the ontology starter kit that make it easy for ontology
developers to incorporate DOSDP-based definition generation into
the general release workflow of their ontologies.</p>
    </sec>
    <sec id="sec-workbench">
      <title>THE ONTOLOGY PATTERN RECONCILIATION WORKBENCH</title>
      <p>
A reconciliation candidate is a set of terms that should
correspond to a common design pattern, for example {HP:0009900,
MP:0004699} (terms referring to unilateral deafness) or {HP:0000822,
MP:0000231} (terms referring to hypertension). The Ontology
Pattern Reconciliation Workbench caters to the following use cases
across multiple ontologies. The user wants to:</p>
      <list list-type="bullet">
        <list-item><p>search and collect reconciliation candidates</p></list-item>
        <list-item><p>explore collected reconciliation candidates</p></list-item>
        <list-item><p>test the impact of a set of externally specified or generated patterns</p></list-item>
        <list-item><p>monitor the reconciliation degree across (parts of the) ontologies</p></list-item>
      </list>
      <p>To realise these use cases, the workbench comes with three main
views. The ‘CandIdent’ view allows users to select any number
of ontologies, search them, individually or at once, for syntactic
patterns and group them into reconciliation candidates (search and
collect). For example, a user might search for patterns relating to
mating behaviour across MP, HP and DPO, and group them together
to form a reconciliation candidate. Candidates can be exported for
use in future sessions.</p>
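      <p>The search-and-collect step can be sketched in a few lines of Python. This is a simplified illustration and not the workbench's implementation: the <monospace>Term</monospace> class and the <monospace>search</monospace> and <monospace>make_candidate</monospace> functions are hypothetical names, the search is a plain substring match, and the two hypertension definitions are abbreviated placeholders rather than the real axioms.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    term_id: str     # CURIE such as "HP:0009900"
    label: str
    definition: str  # logical definition as a Manchester-style string


# Deafness definitions are taken from the text; the hypertension
# definitions are abbreviated placeholders for illustration only.
TERMS = [
    Term("HP:0009900", "Unilateral deafness",
         "'has part' some ('lacking processual parts' and "
         "('towards' some 'sensory perception of sound') and "
         "'has modifier' some 'abnormal' and 'has modifier' some 'unilateral')"),
    Term("MP:0004699", "unilateral deafness",
         "'has part' some ('unilateral' and 'inheres in' some 'deafness' "
         "and 'has modifier' some 'abnormal')"),
    Term("HP:0000822", "Hypertension",
         "'increased pressure' and 'inheres in' some 'blood'"),
    Term("MP:0000231", "hypertension",
         "'increased pressure' and 'inheres in' some 'blood'"),
]


def search(terms, fragment):
    """Return the terms whose logical definition contains the fragment."""
    needle = fragment.lower()
    return [t for t in terms if needle in t.definition.lower()]


def make_candidate(hits):
    """Group search hits into one reconciliation candidate (a set of IDs)."""
    return frozenset(t.term_id for t in hits)


hits = search(TERMS, "'unilateral'")
candidate = make_candidate(hits)
print(sorted(candidate))
```
      </preformat>
      <p>Representing a candidate as an immutable set of term identifiers keeps it independent of the ontologies it was collected from, which is one simple way to support export and reuse across sessions.</p>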
      <p>The ‘Quick impact’ view allows the user to select a set of
ontologies, classify them against a set of patterns and browse the
resulting class hierarchy (test pattern impact). Patterns can be sorted
by impact, which allows the user to get a quick sense of which
patterns might have instances across which ontologies.</p>
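      <p>The idea of ranking patterns by impact can be illustrated with a small Python sketch. It is a syntactic stand-in only: the workbench classifies ontologies against patterns, whereas this example approximates a pattern by a regular expression over definition strings. The pattern names, the <monospace>impact</monospace> function and the choice of regular expressions are all hypothetical.</p>
      <preformat>
```python
import re
from collections import Counter

# Hypothetical patterns, each approximated by a regular expression.
PATTERNS = {
    "quality inheres in entity": r"'inheres in' some '[^']+'",
    "quality towards entity": r"'towards' some '[^']+'",
    "abnormal modifier": r"'has modifier' some 'abnormal'",
}

# The two unilateral deafness definitions from the text.
DEFINITIONS = {
    "HP:0009900": (
        "'has part' some ('lacking processual parts' and "
        "('towards' some 'sensory perception of sound') and "
        "'has modifier' some 'abnormal' and 'has modifier' some 'unilateral')"
    ),
    "MP:0004699": (
        "'has part' some ('unilateral' and 'inheres in' some 'deafness' "
        "and 'has modifier' some 'abnormal')"
    ),
}


def impact(patterns, definitions):
    """Rank patterns by the number of definitions they match.

    Each row is (pattern name, total matches, matches per ontology prefix).
    """
    rows = []
    for name, regex in patterns.items():
        matched = [tid for tid, text in definitions.items()
                   if re.search(regex, text)]
        per_ontology = Counter(tid.split(":")[0] for tid in matched)
        rows.append((name, len(matched), dict(per_ontology)))
    # Highest impact first; ties keep their original order.
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranking = impact(PATTERNS, DEFINITIONS)
for name, total, per_ontology in ranking:
    print(name, total, per_ontology)
```
      </preformat>
      <p>In this toy input only the ‘abnormal modifier’ pattern has instances in both ontologies, which is the kind of signal the impact ordering is meant to surface.</p>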
      <p>The ‘Reconciliation’ view allows users to explore reconciliation
candidates based on a pre-defined mapping. Reconciliation
candidates can be browsed, sorted and filtered through a table or
through the tree browser view, which presents the merged class
hierarchy of the selected input ontologies. By browsing down the
class hierarchy, the user can inspect particular ‘branches’ of the
merged ontology (e.g. sub-classes of ‘behavioural phenotype’), their
level of semantic integration and their corresponding reconciliation
candidates. Individual reconciliation candidates can be inspected
along with the class hierarchies of the terms they relate to (see
Figure 1).</p>
    </sec>
    <sec id="sec-patterns">
      <title>PATTERN DEVELOPMENT AND INSTANTIATION</title>
      <p>
The ontology starter kit (OSK) provides a standard toolkit and
repository structure for developing ontologies. We have extended
the OSK<sup>1</sup> to incorporate pattern-based ontology development using
the Dead Simple OWL Design Patterns (DOSDP) framework
        <xref ref-type="bibr" rid="ref4">(Osumi-Sutherland et al., 2017)</xref>
        . Once
reconciliation candidates are identified (previous section), experts
on the phenotype ontologies involved try to identify a common
pattern for representing definitions, for example ‘has part’ some
(‘increased amount’ and ‘inheres in’ some $A and ‘towards’
some $B and ‘has modifier’ some ‘abnormal’), where $A and
$B are variables that can be filled with concrete classes
from, for example, UBERON and the Gene Ontology (GO). This
pattern can then be represented as a DOSDP. Our extended OSK
incorporates pattern generation into the general release workflow
by first compiling all patterns and their instance files into OWL
definitions and then importing or merging them into the main release
file.
      </p>
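      <p>To illustrate what compiling a pattern and its instance file involves, the following Python sketch fills the example pattern from the text with values read from a tab-separated table. It is a minimal sketch and not the DOSDP tooling itself: the identifier <monospace>EX:0000001</monospace>, the filler labels and the <monospace>instantiate</monospace> function are hypothetical, and the output is a plain string rather than an OWL axiom.</p>
      <preformat>
```python
import csv
import io
from string import Template

# The example pattern from the text; $A and $B are the variables.
PATTERN = Template(
    "'has part' some ('increased amount' and 'inheres in' some $A "
    "and 'towards' some $B and 'has modifier' some 'abnormal')"
)

# Hypothetical instance table: one row per class to be generated.
# The identifier and filler labels are placeholders for illustration.
INSTANCES_TSV = (
    "defined_class\tA\tB\n"
    "EX:0000001\t'blood'\t'example GO class'\n"
)


def instantiate(pattern, tsv_text):
    """Map each defined class to its definition string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["defined_class"]: pattern.substitute(A=row["A"], B=row["B"])
        for row in reader
    }


generated = instantiate(PATTERN, INSTANCES_TSV)
for term_id, definition in generated.items():
    print(term_id, definition)
```
      </preformat>
      <p>Keeping the pattern and the table of fillers separate means that a change to the pattern propagates to every generated definition on the next release build, which is the main benefit of wiring this step into the release workflow.</p>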
      <p>The workbench is available on GitHub<sup>2</sup> and as a Docker
container.<sup>3</sup></p>
    </sec>
    <sec id="sec-2">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work is funded by an NIH Office of the Director Grant
(5R24OD011883) to the Monarch Initiative.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lock</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bähler</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliver</surname>
            ,
            <given-names>S. G.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Wood</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>FYPO: the fission yeast phenotype ontology</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>29</volume>
          (
          <issue>13</issue>
          ),
          <fpage>1671</fpage>
          -
          <lpage>1678</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G. V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S. E.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Integrating phenotype ontologies across multiple species</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ),
          <fpage>R2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Osumi-Sutherland</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marygold</surname>
            ,
            <given-names>S. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Millburn</surname>
            ,
            <given-names>G. H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McQuilton</surname>
            ,
            <given-names>P. A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ponting</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stefancsik</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Falls</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>N. H.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G. V.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>The Drosophila phenotype ontology</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ),
          <fpage>30</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Osumi-Sutherland</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courtot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            ,
            <given-names>J. P.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Dead simple OWL design patterns</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>8</volume>
          ,
          <fpage>18</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Rocca-Serra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Whetzel</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schober</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greenbaum</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courtot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brinkman</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sansone</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scheuermann</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Peters</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Overcoming the ontology enrichment bottleneck with quick term templates</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ),
          <fpage>13</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>1 https://bit.ly/2IYSnZX 2 https://bit.ly/2L4ejjm 3 https://bit.ly/2kC2eX6</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>