=Paper=
{{Paper
|id=None
|storemode=property
|title=Evaluation of a Semantic-oriented Approach to Cross-lingual Ontology Mapping
|pdfUrl=https://ceur-ws.org/Vol-674/Paper134.pdf
|volume=Vol-674
|dblpUrl=https://dblp.org/rec/conf/ekaw/FuBO10
}}
==Evaluation of a Semantic-oriented Approach to Cross-lingual Ontology Mapping==
Evaluation of a Semantic-Oriented Approach to Cross-Lingual Ontology Mapping
Bo Fu, Rob Brennan, Declan O’Sullivan
Knowledge and Data Engineering Group, School of Computer Science and Statistics,
Trinity College Dublin, Ireland
{bofu, rob.brennan, declan.osullivan}@cs.tcd.ie
ABSTRACT
Most ontology mapping research has focused on the matching of ontologies written in the
same natural language, and on developing tools and techniques that support this
monolingual ontology mapping process. However, as knowledge modelling is not restricted
to the usage of a single natural language, mapping systems must be able to operate upon
ontologies that are labelled in diverse natural languages. This paper outlines a
semantic-oriented cross-lingual ontology mapping framework that makes use of several
information sources to influence the selection of ontology label translations in the
process of generating high-quality mapping results, and presents a high-level overview
of the evaluation strategy of the proposed framework.
Keywords
Cross-Lingual Ontology Mapping; Appropriate Ontology Label Translation; Multilingual
Ontologies.
1. INTRODUCTION
Benjamins et al. [1] identify multilinguality as one of the great challenges for the
semantic web, and point out that one way to address this challenge is by providing
assistance for the annotation of ontologies regardless of the natural languages used in
them. However, to date, research in the field of ontology mapping has largely focused
on the matching of ontologies labelled in the same natural language, for which various
monolingual ontology matching techniques have been developed, as documented by Euzenat &
Shvaiko [2]. With ontologies being widely accepted as a knowledge management mechanism
in multilingual organisations [3] and used in a range of applications including machine
translation [4], information retrieval (IR) [5] and cross-lingual IR [6],
multilinguality is increasingly evident in ontologies. One way to enable knowledge
discovery, sharing and reuse across natural language barriers in ontology-based systems
is by means of cross-lingual ontology mapping (CLOM).
This paper proposes the semantic-oriented cross-lingual ontology mapping (SOCOM)
framework and presents a high-level overview of its evaluation.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full
citation on the first page. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee.
EKAW 2010, October 11-15, 2010, Lisbon, Portugal.
2. THE SOCOM FRAMEWORK
The semantic-oriented cross-lingual ontology mapping (SOCOM) framework is designed
specifically for cross-lingual mapping tasks carried out in multilingual environments.
To do so, it first transforms one of the given ontologies into an equivalent of itself
that is labelled in the natural language used by the other(s); it then applies existing
monolingual matching techniques. The transformation of an ontology requires the
translation of ontology labels from the source natural language to the target natural
language, whereby the notion of appropriate ontology label translation (AOLT) is
employed. An AOLT is a translation that is most likely to maximise the success of the
subsequent monolingual ontology matching step. The AOLT selection process is therefore
concerned with identifying the translations that will most likely enhance the matching
ability of monolingual matching techniques, but not necessarily the translations that
are linguistically most correct.
To achieve AOLT, several sources of information are used. Firstly, the source ontology
semantics are used to indicate the context of use for the to-be-translated resource
labels. Given a certain position of a node, the labels of its surrounding nodes (i.e.
its context) can be analysed. For example, for a class node, the labels of its
super/sub/sibling-classes can illustrate its context of use. Secondly, since the source
ontology is transformed so that it can be best mapped to the target ontology, the
target ontology semantics can be perceived as translation selection guidelines. For
example, when several candidate translations are linguistically correct for a label,
its AOLT is the one that is closest to what is used in the target ontology. Thirdly,
mapping intent captures the user’s motive in a CLOM scenario. For example, when working
in a highly refined domain such as medicine, achieving highly precise matches would be
the priority, whereas when merging knowledge repositories, gaining reasonable recall in
the matches generated may be desired. With known intent, the SOCOM framework selects
the most suitable translation source(s) in order to generate mappings with high
precision and/or recall. Fourthly, background knowledge on the ontology domains can be
drawn on, and this knowledge can be system-specified or user-specified. In other words,
encyclopedias or users can assist the AOLT process by providing additional context of
use. Fifthly, to draw on user expertise, the SOCOM framework allows a user to specify
preferred translation sources and/or matching algorithms. Sixthly, mapping assessment
is used as a feedback mechanism in the SOCOM framework, whereby statistics containing
top-rated translation sources and/or matching techniques are collected to aid the
future execution of the framework. This feedback can be implicit or explicit. Implicit
feedback is generated when the system assumes certain matches
are correct and identifies the most effective tools based on that assumption. Explicit
feedback is generated by the users and is more reliable. Seventhly, time constraints may
limit the run time for the AOLT process. For example, when rapid execution is desired,
the user can turn certain features on or off dynamically. Lastly, not all of the
aforementioned resources will always be available in every CLOM scenario. Resource
constraints may therefore restrict the level of sophistication of the AOLT selection
process.
3. EVALUATION STRATEGY
A state of the art review was conducted first to identify current approaches to CLOM.
Through this review process, a generic approach to CLOM was identified and implemented
that uses off-the-shelf machine translation tools and monolingual ontology matching
techniques. To investigate the effectiveness and to identify potential limitations of
this generic approach to CLOM, it was evaluated in two CLOM scenarios involving
ontologies written in Chinese, English and French. These ontologies contain
approximately one hundred entities and are of the semantic research community and the
bibliography domain. Results from these experiments showed that mappings can be missed
by monolingual matching tools when entity labels are translated independently from the
ontologies of interest. When the translations of ontology labels are carried out in
isolation from the CLOM tasks at hand, inadequate and synonymic translations can
introduce further complications to the subsequent monolingual matching step.
Based on this finding, the notion of appropriate ontology label translation arose. An
initial framework prototype was implemented that makes use of the readily defined
semantics of the given ontologies in a CLOM scenario. This prototype was evaluated
against the generic approach in the aforementioned CLOM scenarios using the same
multilingual ontologies and gold standards. Experimental results showed that the SOCOM
framework generated higher quality mapping results than the generic approach due to its
ability to select translations that are similar to those used by the target ontology in
a specific CLOM setting.
Motivated by this initial result, a second framework prototype was then designed and
implemented to draw on additional inputs (discussed in section 2) in the AOLT selection
process, effectively allowing fine tuning of the system. This second prototype was
evaluated against the generic approach in the same CLOM experiments involving the
aforementioned multilingual ontologies. Various combinations of the AOLT influence
sources were executed in a range of experimental runs of the framework, and several
sets of mappings were generated. The variation across these mapping results
demonstrated the flexibility of the AOLT selection mechanism and showcased the tuning
ability of the SOCOM framework.
Furthermore, as the experiments discussed above only concern ontologies of relatively
small sizes, to assess the scalability of the framework, the second prototype was
applied in a real-world CLOM setting involving large organisational ontologies written
in English and German. These ontologies contained over 7000 entities and were generated
semi-automatically using enterprise data of the technical customer support domain. More
details of how these ontologies are generated can be found in [7]. Mappings were then
generated using the SOCOM framework between these large multilingual ontologies in
English and German. These mapping results then enabled cross-lingual document retrieval
in an adaptive personalised result composition and presentation system. Bilingual users
can issue queries in German and retrieve relevant as well as personalised content in
English. More details of this information retrieval and composition system can be found
in [8].
Lastly, in all the experiments carried out, precision, recall and f-measure scores were
calculated to evaluate the quality of the mappings generated. In addition, statistical
analysis, namely two-tailed t-tests, was carried out on the scores generated by the
SOCOM framework and the generic approach in order to validate the statistical
significance of the experimental findings.
4. ACKNOWLEDGMENT
This research is partially supported by Science Foundation Ireland (Grant 07/CE/11142)
as part of the Centre for Next Generation Localisation (http://www.cngl.ie) at Trinity
College Dublin.
5. REFERENCES
[1] Benjamins R. V., Contreras J., Corcho O., Gomez-Perez A. 2004. Six Challenges for
the Semantic Web. AIS SIGSEMIS Bulletin, Vol. 1, Iss. 1, 2004.
[2] Euzenat J., Shvaiko P. 2007. Ontology Matching. Springer-Verlag Berlin/Heidelberg.
[3] Chang C., Lu W. 2002. The Translation of Agricultural Multilingual Thesaurus. In
Proceedings of the 3rd Asian Conference for Information Technology in Agriculture
(Beijing, China, October 26-28, 2002), 526-528.
[4] Shi C., Wang H. 2005. Research on Ontology-Driven Chinese-English Machine
Translation. In Proceedings of 2005 IEEE International Conference on Natural Language
Processing & Knowledge Engineering (Wuhan, China, October 30 - November 01, 2005),
426-430. DOI=10.1109/NLPKE.2005.1598775
[5] Guan J., Deng J., Qu Y. 2005. An Ontology-Driven Information Retrieval Mechanism for
Semantic Information Portals. In Proceedings of 1st International Conference on
Semantic, Knowledge and Grid (Beijing, China, November 27-29, 2005). SKG. IEEE Computer
Society, Washington, DC, 63. DOI=10.1109/SKG.2005.42
[6] Zhang L., Wu G., Xu Y., Li W., Zhong Y. 2004. Multilingual Collection Retrieving Via
Ontology Alignment. In Proceedings of the 7th International Conference on Asian Digital
Libraries (Shanghai, China, December 13-17, 2004), LNCS 3334, 939-957.
DOI=10.1007/978-3-540-30544-6_57
[7] Şah M., Wade V. 2010. Automatic Metadata Extraction from Multilingual Enterprise
Content. In Proceedings of the 19th ACM International Conference on Information and
Knowledge Management (Toronto, Canada, October 26-30, 2010), to appear.
[8] Steichen B., Wade V. 2010. Adaptive Retrieval and Composition of Socio-Semantic
Content for Personalised Customer Care. In Proceedings of International Workshop on
Adaptation in Social and Semantic Web (Big Island of Hawaii, USA, June 21, 2010), 1-10,
ISSN 1613-0073.
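The mapping-quality scores named in the evaluation strategy (precision, recall and f-measure against a gold standard) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `evaluate_mappings`, the representation of a mapping as a (source label, target label) pair, the example labels, and the use of the balanced form of the f-measure are all assumptions made for illustration, since the paper does not specify them.

```python
def evaluate_mappings(generated, gold):
    """Score a set of generated mappings against a gold standard.

    Both arguments are iterables of (source_label, target_label) pairs.
    Returns a (precision, recall, f_measure) tuple of floats.
    Assumption: f_measure is the balanced harmonic mean of precision and recall.
    """
    generated, gold = set(generated), set(gold)
    correct = generated & gold  # mappings that also appear in the gold standard

    # Guard against empty inputs so the ratios are always defined.
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(gold) if gold else 0.0

    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical French-to-English mappings, invented for this example only.
gold_standard = {
    ("Livre", "Book"),
    ("Auteur", "Author"),
    ("Revue", "Journal"),
    ("Editeur", "Publisher"),
}
generated_mappings = {
    ("Livre", "Book"),
    ("Auteur", "Author"),
    ("Revue", "Publisher"),  # an incorrect match
}

precision, recall, f_measure = evaluate_mappings(generated_mappings, gold_standard)
print(f"precision={precision:.3f} recall={recall:.3f} f-measure={f_measure:.3f}")
```

Treating mappings as a set of pairs keeps the comparison with the gold standard to a single intersection. A weighted f-measure, or mappings that carry confidence values, would need a different representation.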