Mapping WordNet to the Basic Formal Ontology using the KYOTO ontology
Selja Seppälä 1∗
1 Department of Philosophy, University at Buffalo, USA


1     INTRODUCTION
Ontologies are often used in combination with natural language
processing (NLP) tools to carry out ontology-related text
manipulation tasks, such as automatic annotation of biomedical
texts with ontology terms. These tasks involve categorizing relevant
terms from texts under the appropriate categories. This requires
coupling ontologies with lexical resources. Several projects have
realized these kinds of mappings with upper-level ontologies that
are extended by domain-specific ontologies (Gangemi et al., 2010;
Laparra et al., 2012; Niles and Pease, 2003; Pease and Fellbaum,
2010). However, no such resource is available for the Basic Formal
Ontology (BFO), which is widely used in the biomedical domain.¹
   We describe and evaluate a semi-automatic method for mapping
the large lexical network WordNet 3.0 (WN) to BFO 2.0 by exploiting
an existing mapping between WN and the KYOTO ontology, which
includes an upper-level ontology similar to BFO. Our hypothesis
is that a large portion of WN, primarily nouns and verbs, can be
semi-automatically mapped to BFO 2.0 types by means of simple
mapping rules exploiting another ontology already linked to WN.

2     ONTOLOGICAL AND LEXICAL RESOURCES
The Basic Formal Ontology (BFO) is a domain-neutral upper-level
ontology (Smith et al., 2012). It represents the types of things that
exist in the world and relations between them. BFO serves as an
integration hub for mid-level and domain-specific ontologies, such
as the Ontology for Biomedical Investigations (OBI) and the Cell
Line Ontology (CLO), which thus become interoperable (Smith
and Ceusters, 2010). BFO is subdivided into CONTINUANTS (e.g.,
OBJECTS and FUNCTIONS) and OCCURRENTS (e.g., PROCESSES
and EVENTS). Continuants can be either independent (e.g., physical
OBJECTS like persons and hearts) or dependent (e.g., the ROLE of a
person as a physician and the FUNCTION of a heart to pump blood).
The most recent version, BFO 2.0, represents 35 types to which
previous versions (BFO 1.0 and BFO 1.1) have been mapped in
Seppälä et al. (2014).
   WordNet 3.0 is a large lexical network linking over 117,000
sets of synonymous English words (synsets) by means of semantic
relations; it is widely used in NLP tasks (Fellbaum, 1998). Noun
and verb synsets are linked via the hypernym relation.² WN 3.0
distinguishes between types and instances (i.e., named entities).
It also links a subset of synsets to topic domains (e.g., ‘medicine’)
and semantic labels (e.g., the ‘noun.artifact’ lexicographer file
contains “nouns denoting man-made objects”³).
   The KYOTO ontology (hereafter KYOTO) is part of a project
aimed at representing domain-specific terms in a computer-tractable
axiomatized formalism to allow machines to reason over texts
in natural language (Vossen et al., 2010). It links WordNets
of different languages to ontology classes, on the basis of a
mapping of the English WN to KYOTO. The approximately 2000
classes of KYOTO are subdivided into three layers: (1) The topmost
layer is based on the Descriptive Ontology for Linguistic
and Cognitive Engineering (DOLCE-Lite-Plus, version 3.9.7) and
OntoWordNet (Gangemi et al., 2003). DOLCE shares a number
of relevant characteristics with BFO: domain neutrality; bi-partition
into ‘endurants’ (CONTINUANTS) and ‘perdurants’ (OCCURRENTS);
a strict hierarchical is_a taxonomy; and a distinction between
independent and dependent entities. (2) The second layer is composed
of noun and verb synsets constituting a set of Base Concepts (BCs).
(3) The third layer contains domain-specific classes (e.g., from the
environmental domain).

3     MAPPING METHOD
Our semi-automatic mapping method involves three main steps:
 1. Manually creating mappings:
      • from KYOTO to BFO on the basis of existing mappings
        of DOLCE to BFO 1.0 and BFO 1.1 (Grenon, 2003; Khan
        and Keet, 2013; Seyed, 2009; Temal et al., 2010), ignoring
        the axiomatization incompatibilities;
      • from BFO 1.0 and BFO 1.1 to BFO 2.0 on the basis of
        work in Seppälä et al. (2014);
      • from WN semantic labels to BFO 2.0.
 2. Manually creating mapping rules using the above mappings
    and extending them with more specific rules from other
    KYOTO types.
 3. Implementing the 33 resulting mapping rules in a Python
    pipeline using the Natural Language Toolkit for Python, which
    integrates WN 3.0 (NLTK 3.0).⁴
   The rules are of the form ‘KYOTO/WN > BFO 2.0’, for
example:
      ‘#non-agentive-social-object > disposition’
      ‘accomplishment > process’
      ‘noun.act > process’
   The implementation first lists all KYOTO types that subsume
a WN synset using the WN-KYOTO mapping data files.⁵ For
example, the synset immunity.n.02 is linked to:

∗ To whom correspondence should be addressed: seljamar@buffalo.edu
¹ See http://ifomis.uni-saarland.de/bfo/users.
² Adjectives and adverbs are linked by way of other semantic relations.
³ See http://wordnet.princeton.edu/man2.1/lexnames.5WN.html.
⁴ Natural Language Toolkit for Python (NLTK), version 3.0, http://www.nltk.org.
⁵ http://kyoto-project.eu/xmlgroup.iit.cnr.it/kyoto/index9c60.html?option=com_content&view=article&id=429&Itemid=156


Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.


‘Kyoto#condition__status-eng-3.0-13920835-n’,
‘Kyoto#state-eng-3.0-00024720-n’,
‘ExtendedDnS.owl#situation’,
‘ExtendedDnS.owl#non-agentive-social-object’,
‘ExtendedDnS.owl#social-object’,
‘DOLCE-Lite.owl#non-physical-object’,
‘DOLCE-Lite.owl#non-physical-endurant’,
‘DOLCE-Lite.owl#endurant’,
‘DOLCE-Lite.owl#spatio-temporal-particular’,
‘DOLCE-Lite.owl#particular’
   Second, the mapping rules are applied starting from the more
specific ones (BFO leaf nodes): the program tests if a given string
(e.g., ‘#non-agentive-social-object’) matches a string in the
types list; if the strings match, the program assigns to that synset
the corresponding BFO 2.0 type (e.g., ‘disposition’). Thus, the
synset immunity.n.02 is categorized as referring to a subtype of
the BFO type DISPOSITION.

4   EVALUATION AND RESULTS
We manually evaluated the method on the 106 synsets in KYOTO
marked with a ‘medicine’ topic domain. Of the assigned BFO
types, 72% were correct (63% of the synsets were assigned the
expected BFO type; 8% a superclass). As hypothesized, all the
correctly categorized synsets were nominal or verbal. A further 27%
of the assigned BFO types were incorrect (mostly for adjectives).
One synset was not matched by any rule.

5   DISCUSSION
WN is too large to be manually mapped to BFO. Using the
properties of the hypernym hierarchy, we could have approached
the problem by mapping the top levels of WN to the relevant
BFO types, and propagating the mapped BFO types downwards.
However, WN’s organization fails to comply with basic ontological
principles (Gangemi et al., 2010). Moreover, that method would
only cover nouns and verbs, while KYOTO also includes adjectives.
   Mapping DOLCE to BFO is not trivial: their categories do not
align in every case and are in some cases governed by different
axioms. The former is meant to capture our use of language and
conceptualization of the world; the latter is a realist ontology and
excludes from its scope unicorns and other putative non-real entities.
However, these differences will not matter for our purposes here.
Mapping WN to BFO is not trivial either: WN represents linguistic
usage, whereas BFO represents entities in the world. WN thus
includes synsets that, in BFO terms, do not refer (at all or to a
BFO type, e.g., positive.a.04). Ten synsets in the evaluation set
posed categorization issues.
   Our solutions to these issues are: (1) to extend the coverage of
the rules by adding other types included in KYOTO and WN’s
semantic labels; (2) to ignore the axiomatizations. Indeed, this work
is neither aimed at mapping DOLCE to BFO, nor at axiomatizing
WN. Instead, we attempt to answer the question: to what types of
entities do WN synsets refer? The resulting mappings are to be read
as ‘a WN synset X refers to something that is a subtype of BFO type
Y’, as in ‘the synset immunity.n.02 refers to a subtype of the BFO
type DISPOSITION’; we exclude instances for now. Even a partial
mapping should be sufficient to cover a large portion of WN, leaving
a smaller subset of problematic cases. An interesting challenge
will be to provide BFO-compliant interpretations of unmatched WN
synsets.

6   CONCLUSION AND FUTURE WORK
We presented a method to semi-automatically map WordNet 3.0
synsets to BFO 2.0 types via the KYOTO ontology. Our preliminary
results are encouraging, but more work is needed to see if the
method scales to the full WN. Future work will include: extending
the evaluation set of medical synsets using hyponymy relations and
other domain resources; carrying out more thorough evaluations,
e.g., by randomly extracting samples of synsets grouped by part
of speech; augmenting the mapping rules by exploiting other
resources, e.g., WN-SUMO mappings and ontologies extending
BFO.

ACKNOWLEDGEMENTS
Work on this paper was supported by the Swiss National Science
Foundation (SNSF). Thanks also to Christopher Crowner, Barry
Smith, and Alan Ruttenberg.

REFERENCES
Fellbaum, C., editor (1998). WordNet: An Electronic Lexical Database. MIT Press,
   Cambridge, MA.
Gangemi, A., Guarino, N., Masolo, C., and Oltramari, A. (2003). Sweetening WordNet
   with DOLCE. AI Magazine, 24(3), 13–24.
Gangemi, A., Guarino, N., Masolo, C., and Oltramari, A. (2010). Interfacing WordNet
   with DOLCE: towards OntoWordNet. In C.-r. Huang, N. Calzolari, and A. Gangemi,
   editors, Ontology and the Lexicon: A Natural Language Processing Perspective,
   pages 36–52. Cambridge University Press.
Grenon, P. (2003). BFO in a Nutshell: A Bi-categorial Axiomatization of BFO and
   Comparison with DOLCE. IFOMIS Report 06/2003. Technical report, Institute
   for Formal Ontology and Medical Information Science (IFOMIS), University of
   Leipzig, Leipzig, Germany.
Khan, Z. C. and Keet, C. M. (2013). Addressing issues in foundational ontology
   mediation. In Proceedings of KEOD’13, pages 5–16, Vilamoura, Portugal.
   SCITEPRESS.
Laparra, E., Rigau, G., and Vossen, P. (2012). Mapping WordNet to the Kyoto ontology.
   In LREC, pages 2584–2589.
Niles, I. and Pease, A. (2003). Linking Lexicons and Ontologies: Mapping Wordnet to
   the Suggested Upper Merged Ontology. In Proceedings of the IEEE International
   Conference on Information and Knowledge Engineering, pages 412–416.
Pease, A. and Fellbaum, C. (2010). Formal ontology as interlingua: The SUMO and
   WordNet linking project and global WordNet. In C.-r. Huang, N. Calzolari, and
   A. Gangemi, editors, Ontology and the Lexicon: A Natural Language Processing
   Perspective. Cambridge University Press.
Seppälä, S., Smith, B., and Ceusters, W. (2014). Applying the Realism-Based
   Ontology-Versioning Method for Tracking Changes in the Basic Formal Ontology.
   In 8th International Conference on Formal Ontology in Information Systems (FOIS
   2014), Rio de Janeiro, Brazil.
Seyed, A. P. (2009). BFO/DOLCE Primitive Relation Comparison. In Nature
   Precedings.
Smith, B. and Ceusters, W. (2010). Ontological Realism: A Methodology for
   Coordinated Evolution of Scientific Ontologies. Applied Ontology, 5, 139–188.
Smith, B., Almeida, M., Bona, J., Brochhausen, M., Ceusters, W., Courtot, M., Dipert,
   R., Goldfain, A., Grenon, P., Hastings, J., Hogan, W., Jacuzzo, L., Johansson, I.,
   Mungall, C., Natale, D., Neuhaus, F., Rovetto, A. P. R., Ruttenberg, A., Ressler, M.,
   and Schulz, S. (2012). Basic Formal Ontology 2.0: DRAFT SPECIFICATION AND
   USER’S GUIDE.
Temal, L., Rosier, A., Dameron, O., and Burgun, A. (2010). Mapping BFO and
   DOLCE. Studies In Health Technology And Informatics, 160(Pt 2), 1065–1069.
Vossen, P., Rigau, G., Agirre, E., Soroa, A., Monachini, M., and Bartolini, R. (2010).
   KYOTO: an open platform for mining facts. In Proceedings of the 6th Workshop on
   Ontologies and Lexical Resources, pages 1–10.
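
The rule-application step described in Section 3 can be illustrated with a minimal Python sketch. It is not the pipeline used in this work. The names RULES, IMMUNITY_N_02, and assign_bfo_type are hypothetical; only the three example rules quoted in Section 3 are included, not the full set of 33; matching is assumed to be a plain substring test; and the order of the list is assumed to encode rule specificity (most specific first).

```python
# Illustrative sketch only: names and matching strategy are assumptions,
# not the implementation evaluated in the paper.

# Example rules of the form 'KYOTO/WN > BFO 2.0', quoted in Section 3.
# ASSUMPTION: list order stands in for "more specific rules first".
RULES = [
    ("#non-agentive-social-object", "disposition"),
    ("accomplishment", "process"),
    ("noun.act", "process"),
]

# KYOTO types listed in Section 3 as subsuming the synset immunity.n.02.
IMMUNITY_N_02 = [
    "Kyoto#condition__status-eng-3.0-13920835-n",
    "Kyoto#state-eng-3.0-00024720-n",
    "ExtendedDnS.owl#situation",
    "ExtendedDnS.owl#non-agentive-social-object",
    "ExtendedDnS.owl#social-object",
    "DOLCE-Lite.owl#non-physical-object",
    "DOLCE-Lite.owl#non-physical-endurant",
    "DOLCE-Lite.owl#endurant",
    "DOLCE-Lite.owl#spatio-temporal-particular",
    "DOLCE-Lite.owl#particular",
]


def assign_bfo_type(labels, rules=RULES):
    """Return the BFO 2.0 type of the first rule whose key occurs in any
    of the given KYOTO types or WN semantic labels, or None if no rule
    matches (an unmatched synset)."""
    for key, bfo_type in rules:
        # ASSUMPTION: "matches" is read as substring containment.
        if any(key in label for label in labels):
            return bfo_type
    return None


print(assign_bfo_type(IMMUNITY_N_02))  # disposition
```

With the type list given for immunity.n.02, the first rule fires and the sketch returns ‘disposition’, in line with the categorization reported in Section 3; a label list that no rule covers yields None.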

