<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mapping WordNet to the Basic Formal Ontology using the KYOTO ontology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Selja Seppälä</string-name>
          <email>seljamar@buffalo.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Philosophy, University at Buffalo</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        Ontologies are often used in combination with natural language
processing (NLP) tools to carry out ontology-related text
manipulation tasks, such as automatic annotation of biomedical
texts with ontology terms. These tasks involve categorizing relevant
terms from texts under the appropriate categories. This requires
coupling ontologies with lexical resources. Several projects have
realized these kinds of mappings with upper-level ontologies that
are extended by domain-specific ontologies
        <xref ref-type="bibr" rid="ref3 ref6 ref7 ref8">(Gangemi et al., 2010;
Laparra et al., 2012; Niles and Pease, 2003; Pease and Fellbaum,
2010)</xref>
        . However, no such resource is available for the Basic Formal
Ontology (BFO), which is widely used in the biomedical domain.
      </p>
      <p>We describe and evaluate a semi-automatic method for mapping
the large lexical network WordNet 3.0 (WN) to BFO 2.0 by exploiting
an existing mapping between WN and the KYOTO ontology, which
includes an upper-level ontology similar to BFO. Our hypothesis
is that a large portion of WN, primarily nouns and verbs, can be
semi-automatically mapped to BFO 2.0 types by means of simple
mapping rules that exploit another ontology already linked to WN.</p>
    </sec>
    <sec id="sec-2">
      <title>2 ONTOLOGICAL AND LEXICAL RESOURCES</title>
      <p>
        The Basic Formal Ontology (BFO) is a domain-neutral upper-level
ontology
        <xref ref-type="bibr" rid="ref12">(Smith et al., 2012)</xref>
        . It represents the types of things that
exist in the world and relations between them. BFO serves as an
integration hub for mid-level and domain-specific ontologies, such
as the Ontology for Biomedical Investigations (OBI) and the Cell
Line Ontology (CLO), which thus become interoperable
        <xref ref-type="bibr" rid="ref11">(Smith
and Ceusters, 2010)</xref>
        . BFO is subdivided into CONTINUANTS (e.g.,
OBJECTS and FUNCTIONS) and OCCURRENTS (e.g., PROCESSES
and EVENTS). Continuants can be either independent (e.g., physical
OBJECTS like persons and hearts) or dependent (e.g., the ROLE of a
person as a physician and the FUNCTION of a heart to pump blood).
The most recent version, BFO 2.0, represents 35 types to which
previous versions (BFO 1.0 and BFO 1.1) have been mapped in
        <xref ref-type="bibr" rid="ref9">Seppälä et al. (2014)</xref>
        .
      </p>
      <p>
        WordNet 3.0 is a large lexical network linking over 117,000
sets of synonymous English words (synsets) by means of semantic
relations; it is widely used in NLP tasks
        <xref ref-type="bibr" rid="ref1">(Fellbaum, 1998)</xref>
        . Noun
and verb synsets are linked via the hypernym relation. WN 3.0
distinguishes between types and instances (i.e., named entities).
It also links a subset of synsets to topic domains (e.g., ‘medicine’)
and semantic labels (e.g., the ‘noun.artifact’ lexicographer file
contains “nouns denoting man-made objects”).
      </p>
      <p>
        The KYOTO ontology (hereafter KYOTO) is part of a project
aimed at representing domain-specific terms in a computer-tractable
axiomatized formalism to allow machines to reason over texts
in natural language
        <xref ref-type="bibr" rid="ref14">(Vossen et al., 2010)</xref>
        . It links WordNets
of different languages to ontology classes, on the basis of a
mapping of the English WN to KYOTO. The approximately 2000
classes of KYOTO are subdivided into three layers: (1) The
topmost layer is based on the Descriptive Ontology for Linguistic
and Cognitive Engineering (DOLCE-Lite-Plus, version 3.9.7) and
OntoWordNet
        <xref ref-type="bibr" rid="ref2">(Gangemi et al., 2003)</xref>
        . DOLCE shares a number
of relevant characteristics with BFO: domain neutrality; bi-partition
into ‘endurants’ (CONTINUANTS) and ‘perdurants’ (OCCURRENTS);
a strictly hierarchical is_a taxonomy; and a distinction between independent
and dependent entities. (2) The second layer is composed of noun
and verb synsets constituting a set of Base Concepts (BCs). (3)
The third layer contains domain-specific classes (e.g., from the
environmental domain).
      </p>
    </sec>
    <sec id="sec-3">
      <title>3 MAPPING METHOD</title>
      <p>
        Our semi-automatic mapping method involves three main steps:
(1) Manually creating mappings: (a) from KYOTO to BFO, on the basis of
existing mappings of DOLCE to BFO 1.0 and BFO 1.1
        <xref ref-type="bibr" rid="ref10 ref13 ref4 ref5">(Grenon, 2003; Khan
and Keet, 2013; Seyed, 2009; Temal et al., 2010)</xref>
        , ignoring the axiomatization incompatibilities; (b) from BFO 1.0 and
BFO 1.1 to BFO 2.0, on the basis of the work in
        <xref ref-type="bibr" rid="ref9">Seppälä et al. (2014)</xref>
        ; (c) from WN semantic labels to BFO 2.0.
(2) Manually creating mapping rules using the above mappings
and extending them with more specific rules from other
KYOTO types.
(3) Implementing the 33 resulting mapping rules in a Python
pipeline using the Natural Language Toolkit for Python (NLTK 3.0),
which integrates WN 3.0.
      </p>
      <p>The rules are of the form ‘KYOTO/WN &gt; BFO 2.0’, for
example:
‘#non-agentive-social-object &gt; disposition’;
‘accomplishment &gt; process’;
‘noun.act &gt; process’.</p>
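      <p>As an illustration only, rules of this form can be held as ordered
(source type, BFO type) pairs. The sketch below parses the three example
rules quoted above; the names <monospace>parse_rule</monospace>,
<monospace>EXAMPLE_RULES</monospace>, and <monospace>PARSED_RULES</monospace>
are hypothetical and are not taken from the pipeline described here.</p>
      <preformat><![CDATA[
```python
from typing import List, Tuple

# The three example rules quoted in the text; the full rule set has 33 rules.
EXAMPLE_RULES = [
    "#non-agentive-social-object > disposition",
    "accomplishment > process",
    "noun.act > process",
]


def parse_rule(rule: str) -> Tuple[str, str]:
    """Split a 'SOURCE > BFO' string into a (source_type, bfo_type) pair."""
    source, sep, target = rule.partition(">")
    if not sep or not source.strip() or not target.strip():
        raise ValueError(f"not a 'SOURCE > BFO' rule: {rule!r}")
    return source.strip(), target.strip()


# List order is preserved, so more specific rules can be placed first.
PARSED_RULES: List[Tuple[str, str]] = [parse_rule(r) for r in EXAMPLE_RULES]

if __name__ == "__main__":
    for source_type, bfo_type in PARSED_RULES:
        print(f"{source_type} maps to {bfo_type}")
```
]]></preformat>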
      <p>The implementation, built on the Natural Language Toolkit for Python
(NLTK), version 3.0 (http://www.nltk.org), first lists all KYOTO types that
subsume a WN synset using the WN-KYOTO mapping data files
(http://kyoto-project.eu/xmlgroup.iit.cnr.it/kyoto/index9c60.html?option= com contentview=articleid=429Itemid=156).
For example, the synset immunity.n.02 is linked to:
‘Kyoto#condition__status-eng-3.0-13920835-n’,
‘Kyoto#state-eng-3.0-00024720-n’,
‘ExtendedDnS.owl#situation’,
‘ExtendedDnS.owl#non-agentive-social-object’,
‘ExtendedDnS.owl#social-object’,
‘DOLCE-Lite.owl#non-physical-object’,
‘DOLCE-Lite.owl#non-physical-endurant’,
‘DOLCE-Lite.owl#endurant’,
‘DOLCE-Lite.owl#spatio-temporal-particular’,
‘DOLCE-Lite.owl#particular’.</p>
      <p>Second, the mapping rules are applied starting from the more
specific ones (BFO leaf nodes): the program tests if a given string
(e.g., ‘#non-agentive-social-object’) matches a string in the
types list; if the strings match, the program assigns to that synset
the corresponding BFO 2.0 type (e.g., ‘disposition’). Thus, the
synset immunity.n.02 is categorized as referring to a subtype of
the BFO type DISPOSITION.</p>
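      <p>The rule-application step can be sketched roughly as follows. This is
a minimal illustration under stated assumptions, not the authors' code: the
function name <monospace>assign_bfo_type</monospace> is hypothetical, string
matching is assumed to be a suffix test, and the second rule
(‘#endurant’ to ‘continuant’) is an illustrative fallback that is not quoted
in the text. The types list is the one given above for immunity.n.02.</p>
      <preformat><![CDATA[
```python
from typing import Optional, Sequence, Tuple

# KYOTO types listed in the text for the synset immunity.n.02.
IMMUNITY_TYPES = [
    "Kyoto#condition__status-eng-3.0-13920835-n",
    "Kyoto#state-eng-3.0-00024720-n",
    "ExtendedDnS.owl#situation",
    "ExtendedDnS.owl#non-agentive-social-object",
    "ExtendedDnS.owl#social-object",
    "DOLCE-Lite.owl#non-physical-object",
    "DOLCE-Lite.owl#non-physical-endurant",
    "DOLCE-Lite.owl#endurant",
    "DOLCE-Lite.owl#spatio-temporal-particular",
    "DOLCE-Lite.owl#particular",
]

# Hypothetical rule list, ordered from more specific to more general.
# Only the first rule is quoted in the text; the second is an assumption.
RULES = [
    ("#non-agentive-social-object", "disposition"),
    ("#endurant", "continuant"),
]


def assign_bfo_type(kyoto_types: Sequence[str],
                    rules: Sequence[Tuple[str, str]]) -> Optional[str]:
    """Return the BFO type of the first rule whose pattern matches a type."""
    for pattern, bfo_type in rules:
        # Assumption: a rule matches when a listed type ends with the pattern.
        if any(t.endswith(pattern) for t in kyoto_types):
            return bfo_type
    return None  # no rule matched


if __name__ == "__main__":
    print(assign_bfo_type(IMMUNITY_TYPES, RULES))
```
]]></preformat>
      <p>Because the rules are tried in order, the more specific
‘disposition’ rule is reached before the general fallback, which is the
behavior the paragraph above describes for immunity.n.02.</p>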
    </sec>
    <sec id="sec-4">
      <title>4 EVALUATION AND RESULTS</title>
      <p>We manually evaluated the method on the 106 synsets in KYOTO
marked with a ‘medicine’ topic domain. 72% of the assigned BFO
types were correct (63% of the synsets were assigned the expected
BFO type; 8% a superclass). As hypothesized, all the correctly
categorized synsets were nominal and verbal. 27% of the assigned
BFO types were incorrect (mostly adjectives). One synset was not
matched by any rule.</p>
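      <p>The evaluation itself was manual. Purely to illustrate the outcome
categories used above (expected type, superclass of the expected type,
incorrect, unmatched), the following sketch compares an assigned type with an
expected one over a small fragment of the BFO 2.0 is_a hierarchy. The names
<monospace>BFO_PARENT</monospace>, <monospace>is_superclass</monospace>, and
<monospace>judge</monospace> are hypothetical, and the fragment covers only a
few of the 35 BFO 2.0 types.</p>
      <preformat><![CDATA[
```python
from typing import Dict, Optional

# Partial fragment of the BFO 2.0 is_a hierarchy (child -> parent).
BFO_PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "continuant": "entity",
    "occurrent": "entity",
    "independent continuant": "continuant",
    "specifically dependent continuant": "continuant",
    "material entity": "independent continuant",
    "object": "material entity",
    "quality": "specifically dependent continuant",
    "realizable entity": "specifically dependent continuant",
    "role": "realizable entity",
    "disposition": "realizable entity",
    "function": "disposition",
    "process": "occurrent",
}


def is_superclass(candidate: str, bfo_type: str) -> bool:
    """Return True if `candidate` is a proper ancestor of `bfo_type`."""
    parent = BFO_PARENT.get(bfo_type)
    while parent is not None:
        if parent == candidate:
            return True
        parent = BFO_PARENT.get(parent)
    return False


def judge(assigned: Optional[str], expected: str) -> str:
    """Label one assignment with an evaluation category."""
    if assigned is None:
        return "unmatched"
    if assigned == expected:
        return "expected"
    if is_superclass(assigned, expected):
        return "superclass"
    return "incorrect"


if __name__ == "__main__":
    print(judge("disposition", "disposition"))
    print(judge("realizable entity", "function"))
    print(judge("process", "disposition"))
```
]]></preformat>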
    </sec>
    <sec id="sec-5">
      <title>5 DISCUSSION</title>
      <p>
        WN is too large to be manually mapped to BFO. Using the
properties of the hypernym hierarchy, we could have approached
the problem by mapping the top levels of WN to the relevant
BFO types, and propagating the mapped BFO types downwards.
However, WN’s organization fails to comply with basic ontological
principles
        <xref ref-type="bibr" rid="ref3">(Gangemi et al., 2010)</xref>
        . Moreover, that method would
only cover nouns and verbs, while KYOTO also includes adjectives.
      </p>
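      <p>For comparison, the top-down alternative described above could be
sketched as follows. The hyponym table is a toy fragment written for this
example rather than extracted from WN 3.0, and the seed mappings and the
function name <monospace>propagate</monospace> are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Dict, List, Optional, Tuple

# Toy hyponym table (parent -> children); not extracted from WN 3.0.
HYPONYMS: Dict[str, List[str]] = {
    "entity.n.01": ["physical_entity.n.01", "abstraction.n.06"],
    "physical_entity.n.01": ["object.n.01", "process.n.06"],
    "object.n.01": ["whole.n.02"],
    "process.n.06": [],
    "whole.n.02": [],
    "abstraction.n.06": [],
}

# Hypothetical manual seed mappings for a few top-level synsets.
SEEDS: Dict[str, str] = {
    "object.n.01": "material entity",
    "process.n.06": "process",
}


def propagate(root: str) -> Dict[str, str]:
    """Give each synset the BFO type of its nearest mapped ancestor."""
    assigned: Dict[str, str] = {}
    stack: List[Tuple[str, Optional[str]]] = [(root, None)]
    while stack:
        synset, inherited = stack.pop()
        # A seed mapping on the synset itself overrides the inherited type.
        bfo_type = SEEDS.get(synset, inherited)
        if bfo_type is not None:
            assigned[synset] = bfo_type
        for child in HYPONYMS.get(synset, []):
            stack.append((child, bfo_type))
    return assigned


if __name__ == "__main__":
    for synset, bfo_type in sorted(propagate("entity.n.01").items()):
        print(synset, bfo_type)
```
]]></preformat>
      <p>In this sketch, synsets with no mapped ancestor simply stay
unassigned, and a synset reachable through several hypernyms would keep
whichever type was visited last, which hints at why such propagation is
fragile on a hierarchy that does not follow ontological principles.</p>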
      <p>Mapping DOLCE to BFO is not trivial: their categories do not
align in every case and are in some cases governed by different
axioms. DOLCE is meant to capture our use of language and
conceptualization of the world; BFO is a realist ontology and
excludes from its scope unicorns and other putative non-real entities.
However, these differences do not matter for our purposes here.
Mapping WN to BFO is not trivial either: WN represents linguistic usage,
whereas BFO represents entities in the world. WN thus includes synsets
that, in BFO terms, either do not refer at all or do not refer to a BFO
type (e.g., positive.a.04).
Ten synsets in the evaluation set posed categorization issues.</p>
      <p>Our solutions to these issues are: (1) to extend the coverage of
the rules by adding other types included in KYOTO and WN’s
semantic labels; (2) to ignore the axiomatizations. Indeed, this work
is neither aimed at mapping DOLCE to BFO, nor at axiomatizing
WN. Instead, we attempt to answer the question: to what types of
entities do WN synsets refer? The resulting mappings are to be read
as ‘a WN synset X refers to something that is a subtype of BFO type
Y’, as in ‘the synset immunity.n.02 refers to a subtype of the BFO
type DISPOSITION’; instances are excluded for now. Even a partial
mapping should be sufficient to cover a large portion of WN, leaving
a smaller subset of problematic cases. An interesting challenge
will be to provide BFO-compliant interpretations of unmatched WN
synsets.</p>
    </sec>
    <sec id="sec-6">
      <title>6 CONCLUSION AND FUTURE WORK</title>
      <p>We presented a method to semi-automatically map WordNet 3.0
synsets to BFO 2.0 types via the KYOTO ontology. Our preliminary
results are encouraging, but more work is needed to see if the
method scales to the full WN. Future work will include: extending
the evaluation set of medical synsets using hyponymy relations and
other domain resources; carrying out more thorough evaluations,
e.g., by randomly extracting samples of synsets grouped by part
of speech; augmenting the mapping rules by exploiting other
resources, e.g., WN-SUMO mappings and ontologies extending
BFO.</p>
    </sec>
    <sec id="sec-7">
      <title>ACKNOWLEDGEMENTS</title>
      <p>Work on this paper was supported by the Swiss National Science
Foundation (SNSF). Thanks also to Christopher Crowner, Barry
Smith, and Alan Ruttenberg.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Fellbaum</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , editor (
          <year>1998</year>
          ).
          <article-title>WordNet: An Electronic Lexical Database</article-title>
          . MIT Press, Cambridge, MA.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guarino</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masolo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Oltramari</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Sweetening WordNet with DOLCE</article-title>
          .
          <source>AI magazine</source>
          ,
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <fpage>13</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guarino</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Masolo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Oltramari</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Interfacing WordNet with DOLCE: towards OntoWordNet</article-title>
          . In C.-R. Huang, N. Calzolari, and A. Gangemi, editors,
          <source>Ontology and the Lexicon: A Natural Language Processing Perspective</source>
          , pages
          <fpage>36</fpage>
          -
          <lpage>52</lpage>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Grenon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>BFO in a Nutshell: A Bi-categorial Axiomatization of BFO and Comparison with DOLCE</article-title>
          .
          <source>IFOMIS Report 06/2003. Technical report, Institute for Formal Ontology and Medical Information Science (IFOMIS)</source>
          , University of Leipzig, Leipzig, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>Z. C.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Keet</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          (
          <year>2013</year>
          ).
          <article-title>Addressing issues in foundational ontology mediation</article-title>
          .
          <source>In Proceedings of KEOD'13</source>
          , pages
          <fpage>5</fpage>
          -
          <lpage>16</lpage>
          , Vilamoura, Portugal. SCITEPRESS.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Laparra</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rigau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Vossen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Mapping WordNet to the Kyoto ontology</article-title>
          .
          <source>In LREC</source>
          , pages
          <fpage>2584</fpage>
          -
          <lpage>2589</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Niles</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Pease</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2003</year>
          ).
          <article-title>Linking Lexicons and Ontologies: Mapping Wordnet to the Suggested Upper Merged Ontology</article-title>
          .
          <source>In Proceedings of the IEEE International Conference on Information and Knowledge Engineering</source>
          , pages
          <fpage>412</fpage>
          -
          <lpage>416</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Pease</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Fellbaum</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Formal ontology as interlingua: The SUMO and WordNet linking project and global WordNet</article-title>
          . In C.-R. Huang, N. Calzolari, and A. Gangemi, editors,
          <source>Ontology and the Lexicon: A Natural Language Processing Perspective</source>
          . Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Seppälä</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Applying the Realism-Based Ontology-Versioning Method for Tracking Changes in the Basic Formal Ontology</article-title>
          .
          <source>In 8th International Conference on Formal Ontology in Information Systems (FOIS 2014)</source>
          , Rio de Janeiro, Brazil.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Seyed</surname>
            ,
            <given-names>A. P.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>BFO/DOLCE Primitive Relation Comparison</article-title>
          . In Nature Precedings.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Ontological Realism: A Methodology for Coordinated Evolution of Scientific Ontologies</article-title>
          .
          <source>Applied Ontology</source>
          ,
          <volume>5</volume>
          ,
          <fpage>139</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Almeida</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bona</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brochhausen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ceusters</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Courtot</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dipert</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Goldfain</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grenon</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hastings</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jacuzzo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Johansson</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Natale</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rovetto</surname>
            ,
            <given-names>A. P. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ressler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Schulz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <source>Basic Formal Ontology 2.0: Draft Specification and User's Guide</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Temal</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dameron</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Burgun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>Mapping BFO and DOLCE</article-title>
          .
          <source>Studies In Health Technology And Informatics, 160(Pt 2)</source>
          ,
          <fpage>1065</fpage>
          -
          <lpage>1069</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Vossen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rigau</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agirre</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soroa</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Monachini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Bartolini</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>2010</year>
          ).
          <article-title>KYOTO: an open platform for mining facts</article-title>
          .
          <source>In Proceedings of the 6th Workshop on Ontologies and Lexical Resources</source>
          , pages
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>