<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>First steps in the logic-based assessment of post-composed phenotypic descriptions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>E. Jiménez-Ruiz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>B. Cuenca Grau</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>R. Berlanga</string-name>
          <email>berlanga@uji.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dietrich Rebholz-Schuhmann</string-name>
          <email>rebholz@ebi.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>European Bioinformatics Institute</institution>
          ,
          <addr-line>Cambridge</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Universitat Jaume I de Castelló</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Oxford</institution>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we present a preliminary logic-based evaluation of the integration of post-composed phenotypic descriptions with domain ontologies. The evaluation has been performed using a description logic reasoner together with scalable techniques: ontology modularization and approximations of the logical difference between ontologies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>A phenotype is defined as a basic observable characteristic of an organism. Thus,
a set of phenotypic descriptions may involve different domains and granularities
ranging from the molecular to the organism level.</p>
      <p>
        Phenotypic descriptions have recently been represented by means of
terminological resources, with the Human Phenotype Ontology (HPO) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] being a
prominent example. The HPO ontology represents a so-called pre-composed
description: it does not provide explicit links between a phenotypic description
(e.g. increased calcium concentration in blood) and the relevant entities
associated with it, such as the chemical element involved (“calcium”), the way in
which it is involved (“increased concentration”) and where it appears (“blood”).
Post-composed phenotypic descriptions aim to provide a more formal
representation that interoperates with the involved entities [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and to allow more powerful
reasoning. Nevertheless, the formal representation of phenotypic descriptions is
still a challenge [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] owing to the complex nature of some phenotypes and the
lack of consensus among clinicians on how to describe them in a standard way.
      </p>
      <p>
        Mungall et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and Hoehndorf et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] have recently proposed automatic
and semi-automatic methods to transform pre-composed phenotypic
descriptions into a description logic (DL) based post-composed representation linked
to domain ontologies. The integration of domain ontologies with post-composed
phenotypic descriptions presents new challenges, since most of the involved
ontologies are developed independently and may conceptualize the same
entities differently. Therefore, this integration may not always lead to
the expected logical consequences [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ]. In this paper we present
first steps towards the logic-based assessment of the integration of phenotypic
descriptions with domain ontologies.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Method and preliminary results</title>
      <p>
        Our experiments are based on a post-composed version of the HPO ontology<sup>4</sup>
(from now on HPOPC), obtained by applying the method from [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The HPO ontology
only provides a classification of pre-composed phenotypic descriptions (e.g. see
left hand side of Figure 1), whereas HPOPC also provides explicit links to relevant
domain entities (see right hand side of Figure 1). HPOPC contains 11382 entities
and uses external concepts from different domain ontologies, including PATO [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
(264 concepts), Cell Ontology (12 conc.), GO (96 conc.), FMA [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] (812 conc.),
CHEBI (33 conc.), and other OBO foundry ontologies [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        A DL reasoner may be used to reclassify HPO concepts according to the
knowledge in HPOPC and the linked ontologies, and thus to obtain new, potentially interesting knowledge.
However, as stated in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], reasoning with HPOPC and all linked ontologies is
time-consuming. To mitigate this limitation, we have extracted a locality-based module
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for each set of referenced external entities. For example, the module for FMA
contains 2044 concepts, which is much easier to reason with than the whole FMA
(around 80000 concepts). We have then built HPOALL<sup>5</sup> by merging HPOPC with
the corresponding modules from the referenced ontologies. The classification of
HPOALL using HermiT [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] takes around 45 seconds on a laptop with 2 GB of memory.
      </p>
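      <p>The idea behind module extraction can be illustrated with a minimal sketch. It assumes a toy ontology made only of atomic subsumptions (sub ⊑ sup); in that restricted setting, an axiom is kept whenever its left-hand side belongs to the growing signature, which coincides with a ⊥-locality-based module. The function name <monospace>extract_module</monospace> and the concept names are hypothetical and do not come from FMA or from the tooling used in the experiments; real modules are extracted from full OWL axioms.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def extract_module(axioms, signature):
    """Collect the axioms relevant to `signature` in a toy ontology.

    `axioms` is an iterable of (sub, sup) pairs, each standing for an
    atomic subsumption sub ⊑ sup. An axiom is added as soon as its
    left-hand side is in the signature, and its right-hand side then
    extends the signature (a fixpoint computation).
    """
    by_sub = defaultdict(set)
    for sub, sup in axioms:
        by_sub[sub].add(sup)

    module = set()
    seen = set()
    todo = list(signature)
    while todo:
        concept = todo.pop()
        if concept in seen:
            continue
        seen.add(concept)
        for sup in by_sub.get(concept, ()):
            module.add((concept, sup))
            todo.append(sup)
    return module


# Illustrative stand-in for an external ontology (NOT real FMA content).
toy_anatomy = [
    ("Blood", "BodyFluid"),
    ("BodyFluid", "AnatomicalEntity"),
    ("Femur", "Bone"),
    ("Bone", "AnatomicalEntity"),
]

# Only "Blood" is referenced by the (hypothetical) phenotype ontology.
module = extract_module(toy_anatomy, {"Blood"})
print(sorted(module))
```
]]></preformat>
      <p>In this toy case the module keeps the two axioms above Blood and drops the Femur and Bone axioms, mirroring on a small scale how the FMA module is much smaller than the whole FMA.</p>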
      <p>
        New subsumption relationships between HPO concepts may represent both
desired new knowledge and unintended consequences. In order to evaluate the
new logical consequences that hold in HPOALL, we have borrowed the notion of logical
difference from [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. The logical difference between two ontologies contains the
set of consequences that are inferred in one of the ontologies but not in the
other. Unfortunately, there is no algorithm for computing the logical difference
in expressive DLs. Moreover, the number of inferences in the difference may
be infinite. Thus, we have reused the approximations of the logical difference
presented in previous work [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], where inferences are one of the following simple
kinds of axiom: (i) A ⊑ B, (ii) A ⊑ ¬B, (iii) A ⊑ ∃R.B, (iv) A ⊑ ∀R.B, and
(v) R ⊑ S (A, B are atomic concepts, including ⊤ and ⊥, and R, S are atomic roles).
        <fn id="fn4"><label>4</label><p>Available from http://bioonto.de/obo2owl/hpo-in-owl.owl</p></fn>
        <fn id="fn5"><label>5</label><p>We have converted the OBO ontologies to OWL using the OWLDEF method [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], and we have normalized the involved concept and property URIs.</p></fn>
      </p>
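      <p>As a rough illustration of the type (i) approximation, the following sketch compares the atomic subsumptions entailed before and after a merge. It is a toy under strong assumptions: both ontologies contain only atomic subsumptions, so entailment reduces to transitive closure, whereas the actual evaluation relies on a DL reasoner. The function names and the concept names (Pheno1, Ext1, and so on) are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def entailed_subsumptions(axioms):
    """Return all non-trivial pairs (a, b) such that a ⊑ b follows from
    `axioms`, a collection of (sub, sup) atomic subsumptions.
    Entailment here is plain transitive closure (toy assumption)."""
    supers = {}
    for sub, sup in axioms:
        supers.setdefault(sub, set()).add(sup)
        supers.setdefault(sup, set())

    closure = set()
    for start in supers:
        reached = set()
        stack = list(supers[start])
        while stack:
            concept = stack.pop()
            if concept in reached:
                continue
            reached.add(concept)
            stack.extend(supers[concept])
        closure.update((start, c) for c in reached if c != start)
    return closure


def subsumption_difference(merged, original, signature):
    """Type (i) inferences that hold in `merged` but not in `original`,
    restricted to concepts in `signature`."""
    new = entailed_subsumptions(merged) - entailed_subsumptions(original)
    return {(a, b) for a, b in new if a in signature and b in signature}


# Hypothetical phenotype ontology linked to two external concepts.
original = [("Pheno1", "Ext1"), ("Ext2", "Pheno2")]
# Hypothetical module contributed by an external ontology.
external_module = [("Ext1", "Ext2")]
merged = original + external_module

diff = subsumption_difference(merged, original, {"Pheno1", "Pheno2"})
print(diff)
```
]]></preformat>
      <p>Restricting the difference to a signature corresponds to looking only at consequences that affect a chosen set of concepts, for instance the HPO concepts.</p>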
      <p>The logical difference between HPOALL and HPOPC, restricted to HPO
concepts, contains 759 new subsumption relationships (inferences of type (i)). The
integration thus indeed leads to a reclassification of HPO concepts. For example,
HPOALL infers the probably unintended consequence Generalized edema ≡
Edema, which did not hold in HPOPC. As shown in the Protégé-like explanation
in Figure 2, the new knowledge from FMA leads to this new consequence.</p>
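      <p>An equivalence such as the one above shows up in the difference as a pair of mutual subsumptions. The sketch below flags equivalences that hold after the integration but not before, given two sets of entailed atomic subsumptions. Which direction of the subsumption already held in HPOPC is an assumption made here purely for illustration, and the function name is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def new_equivalences(before, after):
    """Return unordered concept pairs that are equivalent according to
    `after` but not according to `before`.

    Both arguments are sets of (sub, sup) pairs of entailed atomic
    subsumptions; two concepts are equivalent when each subsumes the other.
    """
    def equivalent_pairs(subsumptions):
        return {
            frozenset((a, b))
            for a, b in subsumptions
            if a != b and (b, a) in subsumptions
        }

    return equivalent_pairs(after) - equivalent_pairs(before)


# ASSUMPTION for illustration: only this direction held before merging.
before = {("GeneralizedEdema", "Edema")}
# After merging, the converse subsumption is entailed as well.
after = before | {("Edema", "GeneralizedEdema")}

result = new_equivalences(before, after)
print([sorted(pair) for pair in result])
```
]]></preformat>
      <p>Listing new equivalences separately is one possible way to prioritise the consequences shown to a domain expert, since a collapsed pair of concepts is often a sign of a modelling conflict.</p>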
      <p>The logical difference also contains 80 new entailments that relate concepts
from domain ontologies (i.e. new cross-references). For example, the GO concept
Epidermis development is classified under the FMA concept Anatomical entity.
This consequence is probably not intended; it is due to the definition of
range axioms in FMA (see Figure 3) and to the use of the property part_of in
different scopes (in FMA it relates anatomical entities, whereas in GO it relates
biological processes). Additionally, if a larger approximation of the logical
difference is considered (i.e. entailments of types (ii)-(v)), further new
consequences are obtained, e.g. GO_0030308 ⊑ ∃negatively_regulates.GO_0040007,
where GO_0030308 stands for Negative regulation of cell growth and GO_0040007
stands for Growth.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusions and future work</title>
      <p>
        The benefits of integrating phenotypic descriptions with domain ontologies have
already been discussed in the literature [
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. However, the consequences of the
integration should be evaluated by domain experts in order to detect potential
unintended consequences.
      </p>
      <p>
        In this paper we have performed a preliminary evaluation<sup>6</sup> in which
state-of-the-art techniques (e.g. ontology reasoning, ontology modularization, logical
difference) have been reused to extract the set of new consequences that arise when
integrating post-composed phenotypic descriptions, such as those provided by HPOPC,
with domain ontologies. In the near future, we intend to develop a system to guide
the expert in the detection and repair of unintended consequences, as in our
previous tool ContentMap [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], in which we assessed the integration of ontologies
through mappings.
      </p>
      <p>
        Moreover, domain ontologies contain cross-references (i.e. mappings) which
have not been considered for this preliminary assessment. These new
correspondences will probably lead to new consequences that should be assessed. Thus,
we also intend to adapt the techniques proposed in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] to this new setting.
        <fn id="fn6"><label>6</label><p>HPOALL and related domain ontology modules are available at http://krono.act.uji.es/people/Ernesto/phenotypeassessment/</p></fn>
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Robinson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mundlos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>The Human Phenotype Ontology</article-title>
          .
          <source>Clinical Genetics</source>
          <volume>77</volume>
          (
          <issue>6</issue>
          ) (
          <year>2010</year>
          )
          <fpage>525</fpage>
          -
          <lpage>534</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Lussier</surname>
            ,
            <given-names>Y.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Terminological mapping for high throughput comparative biology of phenotypes</article-title>
          . In: Pacific Symposium on Biocomputing. (
          <year>2004</year>
          )
          <fpage>202</fpage>
          -
          <lpage>213</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haendel</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Integrating phenotype ontologies across multiple species</article-title>
          .
          <source>Genome Biology</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          ) R2
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Hoehndorf</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oellrich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rebholz-Schuhmann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Interoperability between phenotype and anatomy ontologies</article-title>
          .
          <source>Bioinformatics</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Jiménez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berlanga</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Ontology integration using mappings: Towards getting the right logical consequences</article-title>
          .
          <source>In: European Semantic Web Conference. Volume 5554 of LNCS</source>
          . (
          <year>2009</year>
          )
          <fpage>173</fpage>
          -
          <lpage>187</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Jiménez-Ruiz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berlanga</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Logic-based assessment of the compatibility of UMLS ontology sources</article-title>
          .
          <source>Accepted for publication in Journal of Biomedical Semantics</source>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Gkoutos</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Green</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mallon</surname>
            ,
            <given-names>A.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hancock</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davidson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Using ontologies to describe mouse phenotypes</article-title>
          .
          <source>Genome Biology</source>
          <volume>6</volume>
          (
          <issue>1</issue>
          ) (
          <year>2004</year>
          ) R8
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Mejino Jr.</surname>
            ,
            <given-names>J.L.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Symbolic modeling of structural relationships in the foundational model of anatomy</article-title>
          .
          <source>In: Proceedings of KR-MED</source>
          . (
          <year>2004</year>
          )
          <fpage>48</fpage>
          -
          <lpage>62</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ashburner</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosse</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bard</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bug</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldberg</surname>
            ,
            <given-names>L.J.</given-names>
          </string-name>
          , et al.:
          <article-title>The OBO foundry: coordinated evolution of ontologies to support biomedical data integration</article-title>
          .
          <source>Nature Biotechnology</source>
          <volume>25</volume>
          (
          <issue>11</issue>
          ) (
          <year>2007</year>
          )
          <fpage>1251</fpage>
          -
          <lpage>1255</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Cuenca Grau</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kazakov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Modular reuse of ontologies: Theory and practice</article-title>
          .
          <source>J. of Artificial Intelligence Research</source>
          <volume>31</volume>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Hoehndorf</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oellrich</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dumontier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kelso</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rebholz-Schuhmann</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herre</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Relations as patterns: bridging the gap between OBO and OWL</article-title>
          .
          <source>BMC Bioinformatics</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ) (
          <year>2010</year>
          )
          <fpage>441</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Motik</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shearer</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horrocks</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Hypertableau Reasoning for Description Logics</article-title>
          .
          <source>Journal of Artificial Intelligence Research</source>
          <volume>36</volume>
          (
          <year>2009</year>
          )
          <fpage>165</fpage>
          -
          <lpage>228</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Konev</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Walther</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>The logical difference problem for description logic terminologies</article-title>
          .
          <source>In: IJCAR</source>
          . Volume
          <volume>5195</volume>
          of LNCS, Springer (
          <year>2008</year>
          )
          <fpage>259</fpage>
          -
          <lpage>274</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>