<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Aligning the Human Phenotype and Mammalian Phenotype Ontology using Dead Simple Ontology Design Patterns</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicole Vasilevsky</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>James P. Balhoff</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christopher J. Mungall</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Osumi-Sutherland</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastian Köhler</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Susan Bello</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cynthia Smith</string-name>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Peter Robinson</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Melissa Haendel</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charité - Universitätsmedizin Berlin</institution>
          ,
          <addr-line>10117 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory</institution>
          ,
          <addr-line>1 Cyclotron Road, MS977 Berkeley, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Ontology Development Group, OHSU Library, Oregon Health &amp; Science University</institution>
          ,
          <addr-line>3181 SW Sam Jackson Park Road, Portland, OR</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Renaissance Computing Institute, University of North Carolina</institution>
          ,
          <addr-line>100 Europa Drive, Suite 540, Chapel Hill, NC</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>School of Informatics, The University of Edinburgh</institution>
          ,
          <addr-line>10 Crichton Street, Edinburgh, Scotland</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>The Jackson Laboratory for Genomic Medicine</institution>
          ,
          <addr-line>10 Discovery Drive, Farmington, CT</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          <institution>The Jackson Laboratory</institution>
          ,
          <addr-line>600 Main St, Bar Harbor, ME</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Human Phenotype Ontology (HP) and Mammalian Phenotype Ontology (MP) represent information about abnormal phenotypes encountered in human diseases and mammalian organisms, respectively. Our goal is to co-develop these ontologies and align their terminology and logical axioms to increase their interoperability. Towards this end, we have worked to develop consistent design patterns for commonly used types of classes, such as morphological phenotypes or abnormal anatomical structures. We are currently implementing 'dead simple ontology design patterns' (DOSDP), which are design patterns that can be specified in a YAML text file, reused, and applied to generate new terms, generate documentation, and retrofit old terms. This paper describes the development and implementation of DOSDPs in the HP and MP.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Genotype and phenotype information are commonly used
for characterization and diagnosis of human diseases. The
Human Phenotype (HP) Ontology
(http://www.human-phenotype-ontology.org/) was developed as a standardized
vocabulary describing phenotypic abnormalities
encountered in human diseases
        <xref ref-type="bibr" rid="ref1">(Köhler et al., 2017)</xref>
        .
Similarly, the Mammalian Phenotype (MP) Ontology
(http://www.informatics.jax.org/vocab/mp_ontology)
represents phenotypes encountered in mammalian
organisms that are used as models of human disease
        <xref ref-type="bibr" rid="ref2">(Smith
et al., 2004)</xref>
        . The HP and MP have differing and
overlapping use cases and needs. One shared use case is the
Monarch Initiative (www.monarchinitiative.org), which
uses ontologies such as the HP and MP, together with
semantic technologies, to aggregate data in support of disease
diagnostics for both common and rare diseases, where
large-scale integrated data can inform diagnosis and
decision-making. To integrate diverse data that are
annotated to disparate ontologies, we developed unifying
ontologies such as UberPheno (
        <xref ref-type="bibr" rid="ref3">Köhler 2013</xref>
        ), which
integrates phenotype ontologies, including HP and MP,
using OWL Definitions
        <xref ref-type="bibr" rid="ref4">(Mungall 2010)</xref>
        . We are working to
align the OWL design patterns in these two ontologies to
make them more interoperable and to allow reasoning
across the two ontologies. This requires a simple, lightweight
standard for specifying design patterns that can then
be used to generate documentation, generate new terms,
and retrofit old ones. An ideal solution should be
readable and editable by anyone with a basic knowledge of
OWL and the ability to read Manchester syntax. It must also
be easy to use programmatically without the need for
custom parsers; that is, it should follow an existing data
exchange standard. Human readability and editability
require that Manchester syntax be written using labels, whereas
sustainability and consistency checking require that the
pattern make use of IDs. The approach we used was
developed by David Osumi-Sutherland and is called “Dead
Simple Ontology Design Patterns” (DOSDP)
        <xref ref-type="bibr" rid="ref5">(https://github.com/dosumis/dead_simple_owl_design_patterns, Osumi-Sutherland 2017)</xref>
        .
      </p>
      <sec id="sec-1-1">
        <title>METHODS</title>
        <p>To promote alignment and consistency across the HP
and MP ontologies (and with a view to extending to other
ontologies), we created DOSDP templates for a
number of common phenotype ontology patterns. These are
available in the UberPheno ontology repository
(https://github.com/obophenotype/upheno). Each pattern is
represented as YAML conforming to the DOSDP standard.
Patterns were developed for commonly used classes in HP
and MP. Examples include morphological abnormalities (pattern:
abnormalMorphology.yaml), such as ‘Abnormal heart
morphology’ (HP:0001627 and MP:0000266), and a
decreased level of a molecular entity in a location (pattern:
decreasedLevelOfMolecularEntityInLocation.yaml), such as
‘Hyponatremia’ (HP:0002902) and
‘decreased circulating sodium level’ (MP:0005634).</p>
        <p>
          The design patterns (DPs) were generated through manual inspection of
the ontologies combined with the knowledge of the ontology
curators. In many cases these patterns were already implicit,
if not formally documented
          <xref ref-type="bibr" rid="ref4">(see Mungall 2010)</xref>
          .
        </p>
        <p>Once we had generated the DPs, we used dosdp-tools
(https://github.com/INCATools/dosdp-tools) to query the OWL
definitions in each ontology and determine which classes are
defined according to which pattern.</p>
      </sec>
      <sec id="sec-1-2">
        <title>RESULTS</title>
        <p>To date, 43 patterns have been created for UberPheno; they are
available at
https://github.com/obophenotype/upheno/tree/master/src/patterns.
These DOSDPs can be used to generate new terms,
as the patterns can specify the label, synonyms (exact,
broad, narrow, and related), the text definition, and the
logical definition. An example pattern is displayed in Figure
1.</p>
        <fig id="fig-1">
          <label>Figure 1</label>
          <preformat>pattern_name: abnormal
classes:
  quality: PATO:0000001
  abnormal: PATO:0000460
  Thing: owl:Thing
relations:
  inheres_in: RO:0000052
  qualifier: RO:0002573
  has_part: BFO:0000051
vars:
  entity: Thing
name:
  text: "abnormal %s"
  vars:
    - entity
annotations:
  - annotationProperty: oio:hasExactSynonym
    text: "abnormality of %s"
    vars:
      - entity
def:
  text: "Abnormality of %s."
  vars:
    - entity
equivalentTo:
  text: "'has_part' some ('quality' and ('inheres_in' some %s) and ('qualifier' some 'abnormal'))"
  vars:
    - entity</preformat>
        </fig>
        <p>
We performed an initial analysis of the HP and MP using dosdp-tools
to count the terms that currently
match a design pattern and the terms that
match none. As shown in Table 1, 47%
of HP terms and 72% of MP terms
have logical definitions. In the HP, 12% of terms
currently match a design pattern, and terms with a logical
definition that match no pattern yet defined make up
35% of all terms. In the MP, 39% of terms match
a design pattern, and 33% have a logical definition
but do not match a defined pattern.</p>
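        <p>As a rough illustration of how a pattern such as the one in Figure 1 yields a term, the following Python sketch substitutes a filler for the entity variable into each printf-style text field. The dictionary layout, the fill_pattern helper, and the filler label 'heart' are assumptions made for this illustration; it is not the dosdp-tools implementation, which operates on the YAML pattern files.</p>
        <preformat preformat-type="code">
```python
# Sketch only: fills the printf-style text fields of the "abnormal" pattern.
# The dict layout and helper name are illustrative, not the dosdp-tools API.

ABNORMAL_PATTERN = {
    "name": "abnormal %s",
    "exact_synonym": "abnormality of %s",
    "def": "Abnormality of %s.",
    "equivalentTo": (
        "'has_part' some ('quality' and ('inheres_in' some %s) "
        "and ('qualifier' some 'abnormal'))"
    ),
}


def fill_pattern(pattern, entity_label):
    """Return the generated fields for one filler of the `entity` variable."""
    filled = {}
    for field, template in pattern.items():
        # Class labels are quoted in Manchester syntax; annotation text is not.
        value = "'%s'" % entity_label if field == "equivalentTo" else entity_label
        filled[field] = template % value
    return filled


# Hypothetical filler label chosen for the example.
term = fill_pattern(ABNORMAL_PATTERN, "heart")
for field, text in term.items():
    print(field + ": " + text)
```
        </preformat>
        <p>Under these assumptions the sketch produces the label 'abnormal heart', the exact synonym 'abnormality of heart', and a logical definition in which the filler appears as a quoted Manchester syntax class label.</p>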
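        <table-wrap id="tab-1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th></th><th>Number</th></tr>
            </thead>
            <tbody>
              <tr><th colspan="2">Human Phenotype Ontology</th></tr>
              <tr><td>Total number of terms</td><td>12.358</td></tr>
              <tr><td>Total terms matching a pattern</td><td></td></tr>
              <tr><td>Total terms not matching a pattern</td><td></td></tr>
              <tr><td>Total terms that have logical definitions</td><td></td></tr>
              <tr><td>Total terms that have logical definitions but don't match a pattern</td><td></td></tr>
              <tr><td>Total terms matching a basic EQ pattern</td><td></td></tr>
              <tr><td>Total terms that have a logical definition and don't match a defined pattern but do match basic EQ</td><td></td></tr>
              <tr><td>Total terms matching IIPO EQ pattern</td><td></td></tr>
              <tr><td>Total terms that have a logical definition and don't match a defined pattern but do match IIPO EQ</td><td></td></tr>
              <tr><th colspan="2">Mammalian Phenotype Ontology</th></tr>
              <tr><td>Total number of terms</td><td></td></tr>
              <tr><td>Total terms matching a pattern</td><td></td></tr>
              <tr><td>Total terms not matching a pattern</td><td></td></tr>
              <tr><td>Total terms that have logical definitions</td><td></td></tr>
              <tr><td>Total terms that have logical definitions but don't match a pattern</td><td></td></tr>
              <tr><td>Total terms matching a basic EQ pattern</td><td></td></tr>
              <tr><td>Total terms that have a logical definition and don't match a defined pattern but do match basic EQ</td><td></td></tr>
              <tr><td>Total terms matching IIPO EQ pattern</td><td></td></tr>
              <tr><td>Total terms that have a logical definition and don't match a defined pattern but do match IIPO EQ</td><td></td></tr>
            </tbody>
          </table>
        </table-wrap>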
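        <p>The percentages reported in the text for Table 1 can be cross-checked with simple arithmetic. The Python sketch below uses only the three percentages given for each ontology and assumes, as the reported figures imply, that every term matching a pattern also has a logical definition. The dictionary keys and the summarize helper are names invented for this illustration.</p>
        <preformat preformat-type="code">
```python
# Percentages of all terms, as reported in the Results text.
reported = {
    "HP": {"logical_definition": 47, "matches_pattern": 12, "defined_no_pattern": 35},
    "MP": {"logical_definition": 72, "matches_pattern": 39, "defined_no_pattern": 33},
}


def summarize(shares):
    """Derive the shares that follow from the three reported percentages."""
    defined = shares["logical_definition"]
    matched = shares["matches_pattern"]
    return {
        # Matching and non-matching defined terms should add up to all defined terms.
        "consistent": matched + shares["defined_no_pattern"] == defined,
        # Terms with no logical definition at all.
        "undefined": 100 - defined,
        # Share of logically defined terms already covered by a pattern.
        "pattern_coverage_of_defined": round(100 * matched / defined, 1),
    }


summary = {name: summarize(shares) for name, shares in reported.items()}
for name, row in summary.items():
    print(name, row)
```
        </preformat>
        <p>On these figures, patterns already cover about 25.5% of the logically defined HP terms and about 54.2% of the logically defined MP terms, while 53% of HP terms and 28% of MP terms have no logical definition.</p>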
        <p>Next, we queried all the quality (PATO) terms
used in expressions that match a standard entity-quality (EQ)
pattern in HP and MP but do not match
any of the current DOSDP templates. The standard EQ
pattern is:
"'has_part' some (%s and ('inheres_in' some %s)
and ('has_modifier' some 'abnormal'))"
The results showed that 222 and 282 PATO terms are used in such expressions by
HP and MP, respectively, for which we do not currently have a
DOSDP template. While a DOSDP will
not be created for every pattern, we aim to create
DOSDPs for frequently used quality terms, such as
PATO:0001509 functionality and PATO:0000645
hypoplastic.</p>
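        <p>One way to picture this tally is the Python sketch below, which recognizes the standard EQ pattern quoted above with a regular expression and counts how often each quality label fills the first slot. This is a simplification for illustration only: the analysis itself was run with dosdp-tools against the OWL definitions, the regular expression assumes each slot holds a single quoted label, and the example expressions are toy strings written for the sketch rather than axioms taken from HP or MP.</p>
        <preformat preformat-type="code">
```python
import re
from collections import Counter

# Regex mirror of the standard EQ pattern; the two %s slots become capture groups.
# Assumes each slot is a single quoted label; nested class expressions are not handled.
EQ_PATTERN = re.compile(
    r"^'has_part' some \('([^']+)' and "
    r"\('inheres_in' some '([^']+)'\) and "
    r"\('has_modifier' some 'abnormal'\)\)$"
)

# Toy expressions written for this sketch; not taken from HP or MP.
expressions = [
    "'has_part' some ('functionality' and ('inheres_in' some 'kidney') and ('has_modifier' some 'abnormal'))",
    "'has_part' some ('hypoplastic' and ('inheres_in' some 'thumb') and ('has_modifier' some 'abnormal'))",
    "'has_part' some ('functionality' and ('inheres_in' some 'liver') and ('has_modifier' some 'abnormal'))",
    "'has_part' some 'kidney'",
]

quality_counts = Counter()
for expression in expressions:
    match = EQ_PATTERN.match(expression)
    if match:  # expressions that are not basic EQ are skipped
        quality_counts[match.group(1)] += 1

# Most frequent qualities first: candidates for a dedicated DOSDP template.
ranked = quality_counts.most_common()
print(ranked)
```
        </preformat>
        <p>Ranking qualities by frequency in this way gives a simple basis for deciding which quality terms merit their own DOSDP template first.</p>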
      </sec>
      <sec id="sec-1-3">
        <title>CHALLENGES</title>
        <p>Although DOSDPs will be useful for aligning the design of
the logical definitions between ontologies, they have
limitations. A DOSDP cannot be created and applied for
every use case, and some annotations
will still have to be added manually. For example, the
HP uses tags for layperson synonyms or abbreviations, and
these annotations may have to be added manually after the
term is created.</p>
      </sec>
      <sec id="sec-1-4">
        <title>FUTURE DIRECTIONS</title>
        <p>Future work will aim to retrofit the logical definitions of
HP and MP terms to align the logical axioms between the
two ontologies. Additionally, once these design patterns are
finalized, they can be applied to other phenotype ontologies,
like the Zebrafish Anatomy Ontology. The Cell Ontology
(CL) plans to adopt these DOSDPs as well. Ultimately, we
hope these design patterns can be used to develop new
quality assurance methods for ontologies.</p>
        <p>The DOSDPs will be used by the new Table Editor that is
currently under development as part of the Monarch
Initiative. The Table Editor enables domain-specific concept
visualization, table-based editing, and output of semantically
consistent computable artifacts for use in software
applications and data analytics. It can be used to view and edit
ontologies, for example to generate new terms, within a
lightweight spreadsheet-style web application
(https://incatools.github.io/table-editor/settings).</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>ACKNOWLEDGEMENTS</title>
      <p>Monarch is generously supported by an NIH Office of the
Director Grant #5R24OD011883, as well as by NIH-UDP:
HHSN268201300036C, HHSN268201400093P,
NCI/Leidos #15X143. Table Editor and DOSDP tool
development is supported by NIH Grant #1U01
HG00945301.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>Sebastian Köhler</string-name>, <string-name>Nicole A. Vasilevsky</string-name>, <string-name>Mark Engelstad</string-name>, <string-name>Erin Foster</string-name>, <string-name>Julie McMurry</string-name>,
          <string-name>Ségolène Aymé</string-name>, <string-name>Gareth Baynam</string-name>, <string-name>Susan M. Bello</string-name>, <string-name>Cornelius F. Boerkoel</string-name>, <string-name>Kym M. Boycott</string-name>,
          <string-name>Michael Brudno</string-name>, <string-name>Orion J. Buske</string-name>, <string-name>Patrick F. Chinnery</string-name>, <string-name>Valentina Cipriani</string-name>, <string-name>Laureen E. Connell</string-name>,
          <string-name>Hugh J.S. Dawkins</string-name>, <string-name>Laura E. DeMare</string-name>, <string-name>Andrew D. Devereau</string-name>, <string-name>Bert B.A. de Vries</string-name>, <string-name>Helen V. Firth</string-name>,
          <string-name>Kathleen Freson</string-name>, <string-name>Daniel Greene</string-name>, <string-name>Ada Hamosh</string-name>, <string-name>Ingo Helbig</string-name>, <string-name>Courtney Hum</string-name>,
          <string-name>Johanna A. Jähn</string-name>, <string-name>Roger James</string-name>, <string-name>Roland Krause</string-name>, <string-name>Stanley J. F. Laulederkind</string-name>, <string-name>Hanns Lochmüller</string-name>,
          <string-name>Gholson J. Lyon</string-name>, <string-name>Soichi Ogishima</string-name>, <string-name>Annie Olry</string-name>, <string-name>Willem H. Ouwehand</string-name>, <string-name>Nikolas Pontikos</string-name>,
          <string-name>Ana Rath</string-name>, <string-name>Franz Schaefer</string-name>, <string-name>Richard H. Scott</string-name>, <string-name>Michael Segal</string-name>, <string-name>Panagiotis I. Sergouniotis</string-name>,
          <string-name>Richard Sever</string-name>, <string-name>Cynthia L. Smith</string-name>, <string-name>Volker Straub</string-name>, <string-name>Rachel Thompson</string-name>, <string-name>Catherine Turner</string-name>,
          <string-name>Ernest Turro</string-name>, <string-name>Marijcke W.M. Veltman</string-name>, <string-name>Tom Vulliamy</string-name>, <string-name>Jing Yu</string-name>, <string-name>Julie von Ziegenweidt</string-name>,
          <string-name>Andreas Zankl</string-name>, <string-name>Stephan Züchner</string-name>, <string-name>Tomasz Zemojtel</string-name>, <string-name>Julius O.B. Jacobsen</string-name>, <string-name>Tudor Groza</string-name>,
          <string-name>Damian Smedley</string-name>, <string-name>Christopher J. Mungall</string-name>, <string-name>Melissa Haendel</string-name>, <string-name>Peter N. Robinson</string-name>.
          <article-title>The Human Phenotype Ontology in 2017</article-title>.
          <source>Nucleic Acids Res</source>
          <year>2017</year>;
          <volume>45</volume>(<issue>D1</issue>):
          <fpage>D865</fpage>-<lpage>D876</lpage>.
          doi:10.1093/nar/gkw1039
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name><given-names>Cynthia L</given-names> <surname>Smith</surname></string-name>,
          <string-name><given-names>Carroll-Ann W</given-names> <surname>Goldsmith</surname></string-name> and
          <string-name><given-names>Janan T</given-names> <surname>Eppig</surname></string-name>.
          <article-title>The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information</article-title>.
          <source>Genome Biology</source>
          <year>2005</year>
          6:R7. doi:10.1186/gb-2004-6-1-r7
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name><surname>Köhler</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Doelken</surname>, <given-names>S. C.</given-names></string-name>,
          <string-name><surname>Ruef</surname>, <given-names>B. J.</given-names></string-name>,
          <string-name><surname>Bauer</surname>, <given-names>S.</given-names></string-name>,
          <string-name><surname>Washington</surname>, <given-names>N.</given-names></string-name>,
          <string-name><surname>Westerfield</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Gkoutos</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Schofield</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Smedley</surname>, <given-names>D.</given-names></string-name>,
          <string-name><surname>Lewis</surname>, <given-names>S. E.</given-names></string-name>,
          <string-name><surname>Robinson</surname>, <given-names>P. N.</given-names></string-name>,
          <string-name><surname>Mungall</surname>, <given-names>C. J.</given-names></string-name>
          (<year>2013</year>).
          <article-title>Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research</article-title>.
          <source>F1000Research</source>,
          <fpage>1</fpage>-<lpage>12</lpage>.
          http://doi.org/10.3410/f1000research.2-30.v1
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C. J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Integrating phenotype ontologies across multiple species</article-title>
          .
          <source>Genome Biology</source>
          ,
          <volume>11</volume>
          (
          <issue>1</issue>
          ), R2. http://doi.org/10.1186/gb-2010-11-1-r2
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Osumi-Sutherland</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courtot</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Balhoff</surname>
            <given-names>JP</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            <given-names>C</given-names>
          </string-name>
          .
          <article-title>Dead simple OWL design patterns</article-title>
          .
          <source>J Biomed Semantics</source>
          .
          <year>2017</year>
          Jun 5;
          <volume>8</volume>
          (
          <issue>1</issue>
          ):
          <fpage>18</fpage>
          . doi:10.1186/s13326-017-0126-0.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>