    Towards Constructing Linguistic Ontologies:
Mapping Framework and Preliminary Experimental Analysis

                               Mamoun Abu Helou

                           University of Milano-Bicocca
                        mamoun.abuhelou@disco.unimib.it



      Abstract. Cross-language ontology matching methods leverage a
      translation-based approach to map ontological resources (e.g., concepts)
      lexicalized in different languages. Such methods are largely dependent on
      the structural information encoded in the matched ontologies. This paper
      presents an approach for large-scale cross-language linguistic ontology
      matching. In particular, it shows how synsets belonging to two distinct
      language wordnets can be mapped by means of machine translation and
      frequency analysis on word senses. A preliminary experimental analysis
      of the problem is presented, in which we conduct experiments on
      existing gold-standard datasets. The results are encouraging: they show
      the feasibility of the approach and demonstrate that the sense-selection
      task is a crucial step towards high-quality mappings. We conclude with
      our observations and outline potential future directions.

      Keywords: ontology matching, linguistic ontology, wordnet, translation


1   Introduction
The last decades have witnessed remarkable efforts in the development of ontologies that capture relationships between words and concepts, aiming to represent commonsense knowledge in natural languages and hence to make the semantic connections between words in different languages explicit; such ontologies are often called wordnets or linguistic ontologies [11][13]. One commonly used linguistic ontology is the English WordNet [4].
The success of WordNet motivated the construction of similarly structured lexicons for individual and multiple languages (multi-language lexicons). These include, among others, EuroWordNet [27] and MultiWordNet [19]. Recently, wordnets for many languages have been constructed under the guidelines of the Global WordNet Association. However, the manual method of constructing these ontologies is expensive and time-consuming. Automatic construction of wordnets is another method for building and linking wordnets. De Melo and Weikum [5] used bilingual dictionaries to automatically provide equivalents in various languages for the English WordNet synsets. However, translation tools might remove the language barrier but not necessarily the socio-cultural one [2]. The main challenge is to find the appropriate word sense of the translated word [17].
To enable semantic interoperability across different languages, ontology-based cross-language matching has been explored in recent years [23]. However, the cultural-linguistic barriers [8][2] still need to be overcome, both in terms of the mapping process and techniques and in formally defining the semantics of mappings that align concepts lexicalized in different natural languages [1]. In addition, languages do not cover exactly the same part of the lexicon and, even where they share common concepts, these concepts are lexicalized differently [11]. One of the problems in creating linguistic ontologies via a cross-language matching approach is that one needs to map an unstructured or weakly structured lexicon to a structured lexicon [13]. This introduces an extremely difficult and challenging matching problem for several reasons: (i) the lack of structural information, which is often used in matching techniques [23]; (ii) the large mapping space (e.g., WordNet 3.0 has 117659 concepts); (iii) the quality (uncertainty) and coverage of the translation resources, since we need to assume that all sense distinctions given by the translations are correct and available in a translation resource; and (iv) word ambiguity, that is, we need to select the most common and accepted meaning (sense) of a word [17]. The difficulties of this task arise from word polysemy (a word can convey more than one meaning) and synonymy (two or more words can convey the same meaning).
The research presented here aims to contribute to the construction of large linguistic ontologies [12]. The idea is to semi-automate this process by first matching source-language concepts (synsets) to the concepts of an established wordnet in a target language, and then deriving the semantic relations among the source concepts from the relations among concepts in the target wordnet. We argue that the resulting relations can provide an initial set of relations that can be manually validated and corrected [1]. Before selecting and/or extending the most appropriate existing cross-language mapping techniques, we need to be able to compare alternative methods and to assess the quality of their output.
Contribution.
In this paper I introduce a semi-automatic mapping framework that maps synsets in different languages by combining translation tools and word sense disambiguation (WSD) into a hybrid task. I define a mapping algorithm for constructing linguistic ontologies, by mapping unstructured concepts to structured ones, as a maximization problem [26] that retrieves the top k mappings from a set of sorted candidate mappings.
Since the creation of wordnets relies on mappings among concepts expressed (lexicalized) in different languages, one of the research areas most relevant to the ontology creation problem is cross-language ontology matching (CLOM). CLOM techniques [6] can play a crucial role in bootstrapping the creation of large linguistic ontologies and, for analogous reasons, in enriching existing ontologies. We also remark that the above considerations are general and can be reused for different languages. We demonstrate this by considering benchmark datasets for different pairs of languages. In particular, we discuss the results of investigating and assessing the mapping of an unstructured set of synsets in one language to an existing wordnet in a different language. The proposed algorithm leverages translation tools and tries to map synsets lexicalized in one language (e.g., Arabic) to their corresponding synsets in another language (e.g., English). Different translation settings were considered in order to investigate the appropriate translation methods for obtaining the correct translation. The algorithm ranks the translated synsets in order to select the most appropriate senses. This is followed by an experimental analysis and discussion. With this experiment we aim to answer the following questions: (i) what is the best translation tool in terms of coverage and correctness? (ii) what is the impact of the correct translation on the sense disambiguation task? (iii) what is the impact of providing (partial/semi-)structural knowledge among the source synsets? This paper focuses on the first two questions, leaving the third for further investigation in future work.
The rest of the paper is structured as follows. Section 2 overviews the related work. Section 3 illustrates the mapping algorithm. Section 4 presents the experiment, the evaluation settings, and a discussion of the obtained results. Section 5 concludes and outlines future steps.


2   Background and State of the Art

The last decade witnessed a wide range of ontology matching methods which have been successfully developed and evaluated in the OAEI [23]. The majority of the techniques proposed in these systems have mainly focused on mapping between ontological resources that are lexicalized in the same natural language (so-called mono-language ontology matching, MOM). However, methods developed for MOM systems cannot directly access semantic information when ontologies are lexicalized in different natural languages [6]. There is a need for a method that automatically reconciles information when ontologies are lexicalized in different natural languages [8].
Manual mapping (by experts) has been used to generate and review mapping quality [14]. The mappings generated by such approaches are likely to be accurate and reliable. However, this can be a resource-consuming process, especially for maintaining large and complex ontologies. An unsupervised method was suggested based on (non-parallel) bilingual corpora [16]. This approach, as it happens with most unsupervised learning methods, heavily relies on corpus statistics. De Melo and Weikum [5] constructed a binary classification learning problem to automatically determine the appropriate senses among the translated candidates. To create their classifier they used several scores that take into account structural properties as well as semantic relatedness and corpus frequency information. The authors acknowledged that this technique is imperfect in terms of quality and coverage of language-specific phenomena [5].
In general, to resolve the cross-lingual issue, a translation-based approach [6] is considered in order to transform the CLOM problem into a MOM one. These systems deeply leverage the structural information derived from the mapped ontologies. Furthermore, the approach of Spohr et al. [25], like all supervised learning methods, requires a significant number of labeled training samples and well-designed features to achieve good performance.
Another interesting line of work for resolving the cross-lingual issue exploits Wikipedia, a collaborative and multilingual resource of world and linguistic knowledge. Hassan and Mihalcea [9] developed a cross-lingual version of Explicit Semantic Analysis (ESA, [7]), CL-ESA. Similar works that use ESA to link concepts across different languages are presented in [3] and [15]. WikiMatch, a matching system presented in [10], searches Wikipedia page (article) titles for a given term (e.g., the ontology labels and comments) and retrieves all language links describing the term, making use of the inter-lingual links between Wikipedia pages. However, such approaches are limited by, and highly dependent on, the lexical coverage provided by Wikipedia inter-lingual links.
A notable approach for disambiguating and linking cross-lingual senses was presented in BabelNet [18]. Since Wikipedia inter-lingual links do not exist for all Wikipedia pages, Navigli and Ponzetto [18] proposed a context-translation approach. They automatically translate (using Google Translate) a set of English sense-tagged sentences. After applying the automatic translation, the most frequent translation is detected and included as a variant for the mapped senses in the given language. However, it is not clear whether they employed any specific NLP techniques in this process, or whether they aligned the translated words with words in the original (English) sentence (cf. the word aligner KNOWA [20]). Moreover, such frequency counts do not necessarily preserve the part of speech of the translated words. They only translated Wikipedia entries whose lemmas (page titles) do not refer to named entities 1 . For lemmas in WordNet which are monosemous (i.e., words that have only one meaning), they translated without context and simply included the translations returned by Google Translate. They [18] reported that the majority of the translated senses are monosemous.
BabelNet [18] mappings were evaluated against a manually mapped random sample and against gold-standard datasets. However, it is important to understand whether the obtained mappings were achieved through the context- or the contextless-translation approach. More importantly, the monosemous and polysemous translated senses were not quantified separately in their evaluation, noting that monosemous senses form a substantially large portion of the evaluated wordnets. It is important to measure both the contribution and the quality of the context-translation approach. For instance, the Italian wordnet contains about 66% monosemous senses, while BabelNet covered less than 53% and 74% of the Italian WordNet's [19] senses (words) and synsets, respectively. They determined the coverage as the percentage of gold-standard synsets that share a term (one synonym overlap) with the corresponding synset obtained from the mapping algorithm. This does not necessarily imply high-quality mappings. It is worth noting that the BabelNet context-translation approach covers about 25% of the Arabic WordNet's [22] words (which were used as a benchmark).




1 About 90% of Wikipedia pages are named entities, which were included directly in BabelNet without translation [18].




3   Mapping Algorithm

Given a pair of wordnets wn^{L1} and wn^{L2} in different languages (L1 ≠ L2), respectively called the source and target wordnet (we call it a source wordnet although it has no relations among its synsets), the mapping algorithm tries to find for each synset syn^{L1} ∈ wn^{L1} an equivalent synset syn^{L2} ∈ wn^{L2} from the target wordnet, such that it maximizes the probability of providing an appropriate correspondence to syn^{L1}. In order to map the synsets, we make use of the mapping algorithm whose pseudocode is presented in Algorithm 1. The following steps are performed for each synset syn^{L1} ∈ wn^{L1}:
line 3: the translation function trans looks up all the possible translations in the target language for each word in the source synset, w_i^{L1} ∈ syn^{L1}.
line 4: the sense function (sense(w^L) = {syn_1^L, ..., syn_n^L}) looks up all the candidate senses, candSenses, from the target wordnet for each translated word.
line 5: the rank function accepts the candSenses and returns ranked senses (rSenses) ordered by the most appropriate senses. This is performed by counting the frequency of each sense in the candSenses set; the rank function also gives a higher priority (weight) to synsets obtained from translating different words in the source synset (i.e., synonym words that give the same translation).
line 6: the select function selects the top k mappings (a set of mappings) from the rSenses. For a single WSD mapping task, k = 1. If a tie occurs, the senses are selected based on the higher ratio between the number of translated words and the number of synonym words in the candidate senses (i.e., the ratio between the sizes of the mapped synsets); if the tie persists, the sense is selected randomly. As a result of executing the algorithm, a set of ranked mappings is returned.
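As an illustration of the steps above, a minimal Python sketch of the algorithm follows. It is our own rendering under stated assumptions, not the original Algorithm 1: trans and sense are the lookup functions named in the text, passed in as callables, and target synsets are assumed to be hashable objects.

```python
from collections import Counter

def map_synset(source_synset, trans, sense, k=1):
    """Map one source-language synset to its top-k candidate target synsets.

    source_synset : iterable of source-language words (synonyms)
    trans(word)   : returns the set of target-language translations of a word
    sense(word)   : returns the candidate target synsets containing a target word
    """
    # lines 3-4: translate every source word and collect candidate senses,
    # remembering which source words led to each candidate.
    freq = Counter()
    source_words = {}
    for w_src in source_synset:
        for w_tgt in trans(w_src):
            for syn in sense(w_tgt):
                freq[syn] += 1
                source_words.setdefault(syn, set()).add(w_src)

    # line 5: rank by frequency, giving higher priority to candidates reached
    # from several different source words (synonyms agreeing on a translation).
    ranked = sorted(freq, key=lambda s: (len(source_words[s]), freq[s]), reverse=True)

    # line 6: select the top-k candidates; the tie-break described in the text
    # (ratio between translated words and candidate-synset size) is omitted here.
    return ranked[:k]
```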


4   Experiment

Wordnets. Three wordnets have been used in the experiment: the Arabic WordNet (2.0) [22], the Italian component of MultiWordNet [19], and the English WordNet (3.0) [4]. The wordnets have 11214, 34728 and 117659 synsets, respectively; they contain 15964, 40178 and 155287 words, which form 23481, 61558 and 206941 total word senses, respectively. The Arabic WordNet and the Italian WordNet were used to benchmark the mapping algorithm. Details on each wordnet are provided below.
- The Arabic WordNet (ArWN) consists of 11214 Arabic synsets which were constructed from 15964 vocalized Arabic words (13867 unvocalized words, i.e., without diacritics). The ArWN authors [22] mapped the Arabic synsets to their equivalent synsets in the English WordNet (2.0). As a result, 10413 synsets have been manually mapped to the corresponding English WordNet (EnWN) synsets. Out of the 10413 Arabic-English mapped synsets, 54 mappings do not have an Arabic lexicalization (such concepts were considered lexical gaps [27]), and 10 synsets have no corresponding English synsets in EnWN (3.0) due to part-of-speech mismatching. Overall, the resulting mappings are 10349 Arabic-English equivalent synset mappings.
- The Italian WordNet (ItWN) 2 authors [19] manually mapped the Italian synsets to their equivalent synsets in EnWN (1.6). Later, under the Open Multilingual Wordnet initiative, the ItWN was mapped to EnWN (3.0). As a result we have 34728 Italian-English equivalent synset mappings, including 997 lexical gaps (mappings that do not have a corresponding Italian lexicalization). Overall, the resulting mappings were 33731 Italian-English equivalent synset mappings.

Evaluation. The goal is to disambiguate the senses and to find the appropriate mappings between synsets lexicalized in different languages. The disambiguation task was evaluated using measures borrowed from the information retrieval field [17]. The coverage is defined as the percentage of mappings in the test set for which the mapping method provides a sense mapping. The precision of the mapping method is computed as the percentage of correct mappings among those given by the method; this reflects how good the mappings obtained by the assessed method are. The recall is defined as the ratio between the number of correct mappings provided by the method being assessed and the total number of mappings to be provided (the number of mappings in the test set, i.e., the benchmark dataset). The F-measure, defined as the weighted harmonic mean of precision and recall, is also used.
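Writing M for the set of sense mappings returned by the method and G for the gold-standard test set (notation ours, consistent with the definitions above), these measures can be summarized as:

```latex
\mathrm{Coverage} = \frac{|M|}{|G|}, \qquad
\mathrm{Precision} = \frac{|M \cap G|}{|M|}, \qquad
\mathrm{Recall} = \frac{|M \cap G|}{|G|}, \qquad
F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```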
In addition, a baseline (lower bound) setting was used: the first-sense heuristic was used to measure the lower bound of the experiment. Note that, in general, most WSD algorithms can hardly beat this bound [17]. The upper bound (having the correct translations, i.e., oracle translation) was also computed in order to specify the highest expected performance of the proposed approach.

Translation Methods. In the experiment we used different resources for translation. (i) Machine translation tools: Google Translate was used to obtain the English translation of all the Arabic and Italian words in the benchmark dataset. (ii) Machine-readable dictionaries: the Sina dictionary was used for Arabic-to-English translations. The Sina dictionary is a result of the ongoing Arabic Ontology project [12]; it was constructed by integrating several specialized and general-domain dictionaries. An Italian-to-English translation dictionary should be considered in future work. (iii) Oracle translation: an oracle is a hypothetical system which is always supposed to know the correct answer (i.e., the correct translation). We used the translations provided in the benchmark wordnets as an oracle (correct translation); the oracle translation was used to demonstrate the upper bounds. Moreover, (iv) an extended-oracle translation was obtained for the Arabic-English translations by considering all the synonyms of the translated word, not only the translations provided in the ArWN. For the Italian words one can only obtain the extended-oracle translation from the ItWN dataset. Finally, (v) all dictionaries: the translations above were combined in order to investigate the accuracy of the different translation resources.

2 http://multiwordnet.fbk.eu/

Fig. 1. Overall experiment results.
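As an illustration only, the translation settings above can be viewed as interchangeable implementations of the trans function used in the mapping sketch of Section 3; the dictionary objects below are hypothetical lookup tables, not artifacts of the paper.

```python
def make_trans(*dictionaries):
    """Build a trans() lookup from one or more bilingual dictionaries.

    Each dictionary maps a source-language word to a set of target-language
    translations; the 'All dictionaries' setting is simply their union.
    """
    def trans(word):
        translations = set()
        for d in dictionaries:
            translations |= d.get(word, set())
        return translations
    return trans

# Hypothetical usage with pre-collected lookup tables (e.g., cached Google
# Translate output, the Sina dictionary, and the oracle wordnet translations):
# trans_google      = make_trans(google_dict)                       # experiment No.1
# trans_google_sina = make_trans(google_dict, sina_dict)            # experiment No.2
# trans_all         = make_trans(google_dict, sina_dict, oracle_dict)  # experiment No.5
```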

Results. The results are reported in Figure 1; the "Experiment" column specifies the translation method. The reported measures evaluate (as %) whether the equivalent mappings are among the top k mappings, for k ∈ [1, 100]. The lower bound (baseline) experiments are also reported.
In Figure 2, four variants that exploit the structural information of the target wordnet were considered to select the equivalent mappings: (1) isEquivalent (isCorrect): the correct equivalent mapping appears among the top k candidate synsets. (2) isHypernym: the candidate synset is a hypernym of the correct mapping. (3) hasHypernym (or isHyponym): the hypernym of the candidate synset is an equivalent mapping. (4) isSister: the candidate synset is a sister node of an equivalent mapping (it has the same hypernym synset). Figure 2 also shows the upper and lower bounds, and the precision of the mappings obtained using Google Translate for the ArWN synsets (Figure 1, experiment No.1).
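A minimal sketch of how these four structural variants can be checked against the English WordNet, here using NLTK's WordNet interface as a stand-in for EnWN 3.0 (the helper function and its name are our own illustration; the paper does not prescribe an implementation):

```python
from nltk.corpus import wordnet as wn

def structural_relation(candidate, gold):
    """Classify a candidate EnWN synset with respect to the gold mapping,
    following the variant names used in Figure 2."""
    if candidate == gold:
        return "isEquivalent"
    if gold in candidate.hyponyms():      # candidate is a hypernym of the gold synset
        return "isHypernym"
    if gold in candidate.hypernyms():     # the candidate's hypernym is the gold synset
        return "hasHypernym"
    if set(candidate.hypernyms()) & set(gold.hypernyms()):
        return "isSister"                 # sister nodes: they share a hypernym
    return "none"

# Example: structural_relation(wn.synset('dog.n.01'), wn.synset('canine.n.02'))
# returns 'hasHypernym', since canine.n.02 is a direct hypernym of dog.n.01.
```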

Fig. 2. Result of using Google translation for mapping ArWN-to-EnWN.

Discussion. In experiment No.2 (Sina & Google translation), although the coverage is similar to that of experiment No.1, about 100 more mappings were covered, thanks to the correct translations contributed by the Sina dictionary. From experiment No.3 (Oracle translation) we can notice that the missed mappings are only the lexical-gap synsets. From experiment No.4 (Extended oracle translation) the important observation is that the more accurate the translations, the better we can rank and select the candidate senses; that is, the difference between using the oracle translation alone and additionally adding the synonym words is that the equivalent mappings are obtained with a lower value of k.
Experiment No.5 (All dictionaries) highlights the fact that even when the correct translations are available, the presence of inaccurate translations (obtained by combining Google translations with the domain-specific translations in the Sina dictionary) introduces noise and raises the ambiguity. This ranks the correct sense in lower positions and increases the value of k needed to find the correct mappings. This means it is important to first filter out the incorrect translations and only then perform the ranking and selection steps. All the performed experiments outperformed the baseline (lower bound) settings.
Having the correct translation is an important step, but selecting the correct senses remains a crucial task. One can notice from the results that although we have high (about 100%) coverage, we still need to consider a high value of k to obtain the correct mappings. In fact, the quality of the mapping algorithm lies in providing the correct mappings while minimizing the value of k, and this depends on the rank function. Moreover, considering the structural information in the target wordnet (EnWN) improves the results. For instance, in experiment No.1 (Google dictionary), considering the neighbor nodes (e.g., isSister) improves the results by around 10% for small k values, k ∈ [5, 20]. For this reason, we believe that exploiting the structural information (e.g., graph-based and similarity-based approaches [21]) is expected to help rank the correct senses in higher positions. Considering the hypernyms of the candidate synsets (hasHypernym plot, Figure 2) is also expected to improve the results.



Error Analysis. The main reasons for missing some mappings can be divided into the following error types:
(a) Incorrect translation: a number of mappings were missed due to incorrect translation. This can be divided into four error categories: (i) compound words, (ii) named entities, (iii) the translation (word) does not exist in WordNet, and (iv) bad translation, where by bad translation we mean that Google Translate only re-spells the source word in the English alphabet. For instance, regarding the ArWN-EnWN mappings, out of the 988 missing mappings (about 10% of the mappings), 760 are compound (one-word) synsets and 33 mappings are named-entity synsets. Moreover, 799 mappings do not have an equivalent EnWN lemma, and 189 mappings were considered bad translations.
(b) Monosemous words: the ArWN has 4197 one-word synsets and 4370 monosemous synsets (all words in the synset are monosemous); 2111 synsets are compound (one-word) synsets. The majority of the compound one-word synsets (about 94%) which were used in mapping the Arabic and the English synsets are monosemous. Due to this fact, about 10% of the ArWN-EnWN mappings have only one candidate sense, and about 24% have up to 10 candidate senses. Regarding the ItWN-EnWN mappings, about 35% of the mappings have only one candidate sense; this high value is due to the fact that most of the ItWN synsets (about 66%) are monosemous. About 74% of the synsets have up to 10 candidate senses. By studying the distribution of the rank values for the Google translation for both the ArWN and the ItWN, about 85% of the candidate senses were equally ranked with one (the sense appears once in the candSenses set); thus, increasing k retrieves more correct equivalent mappings.
(c) Polysemous words: the ArWN has 3641 polysemous words, such that about 30% and 16% of the polysemous synsets are, respectively, two- and three-word synsets. The ItWN has 10366 polysemous words, such that about 64% and 20% of the polysemous synsets are, respectively, two- and three-word synsets. This highlights the fact that most of the benchmark polysemous synsets are small in size (number of words), which makes it more difficult to distinguish the correct sense. For instance, in experiment No.4, when the synonym translations were considered (increasing the size of the translated synsets), the ranking function performed better than in experiment No.3.


5   Conclusions & Future Work

This paper investigated how synsets from different languages can be mapped, and especially the impact of translation tools and of the selection of candidate synsets for mapping. A cross-language mapping algorithm was presented; the algorithm tries to maximize the probability that mappings with higher rank are considered correct mappings by users, based on the frequency of translated synsets and a majority-voting approach. The experiments have demonstrated several outcomes that can be summarized as follows:

 1. the approach was successfully tested over two different pairs of languages,
    which demonstrates its adaptability across different languages.
 2. using the structural information encoded in the target wordnet improves
    the sense-selection task.
 3. combining the translations of machine translation (MT) tools with a
    bilingual dictionary improves the results (Figure 1, experiment No.2).
 4. the proposed approach outperforms the baseline settings. The upper bounds
    indicate that there is room for further improvement in terms of obtaining
    the correct translations and better ranking the candidate senses.

Moreover, features obtained from the MT tools (Google Translate), such as the translation score and the synset translations, need to be explored in order to filter the correct translations and to better rank the candidate senses. In addition, NLP techniques (e.g., stemming, headword extraction, etc.) are expected to improve the MT coverage and to yield more candidate senses (instead of using pure translation lookup and exact word-sense matching).
Currently I am planning to consider datasets for other languages, and to investigate the construction of partially structured source synsets and its impact on the mapping algorithm, inspired by the work presented in [21]. Another interesting direction is to crowdsource the construction process by providing workers with the top k mappings [24]; at the same time, this should simulate the agreement of the majority of speakers [1].


Acknowledgments. This work was funded by EU FP7 SIERA project.


References

1. M. Abu Helou, M. Palmonari, M. Jarrar, Ch. Fellbaum. Towards Building Linguistic
   Ontology via Cross-Language Matching. GWN 2014
2. P. Cimiano, E. Montiel-Ponsoda, P. Buitelaar, M. Espinoza, A. Gómez-Pérez. A note
   on ontology localization. Applied Ontology. 2010.
3. P. Cimiano, A. Schultz, S. Sizov, P. Sorg, S. Staab, Explicit vs. latent concept models
   for cross-language information retrieval, in: Proc. of the 21st IJCAI,2009.
4. Ch. Fellbaum. Wordnet: An electronic lexical database. MIT Press,1998.
5. G. De Melo and G. Weikum. Constructing and utilizing wordnets using statistical
   methods. Language Resources and Evaluation,2012.
6. B. Fu, R. Brennan, and D. O’Sullivan. A configurable translation-based cross-lingual
   ontology mapping system to adjust mapping outcomes. J. Web Sem., 2012.
7. E. Gabrilovich, S. Markovitch, Computing semantic relatedness using Wikipedia-
   based explicit semantic analysis, in: Proceedings of the 20th IJCAI, India, 2007.
8. J. Gracia, E. Montiel-Ponsoda, P. Cimiano, A. Gómez-Pérez, P. Buitelaar, J. McCrae,
   Challenges for the multilingual web of data, JWS, 2012.
9. S. Hassan and R. Mihalcea, Cross-lingual Relatedness using Encyclopedic Knowl-
   edge, to appear in Proc. EMNLP 2009.
10. S. Hertling and H. Paulheim. WikiMatch - Using Wikipedia for Ontology Matching.
   In Proceedings OM, 2012.
                                Towards Constructing Linguistic Ontologies

11. G. Hirst, Ontology and the Lexicon, in Handbook on Ontologies and Information
   Systems. eds. S. Staab and R. Studer. Heidelberg: Springer, 2004.
12. M. Jarrar, Building A Formal Arabic Ontology (invited paper), Proc. of the Ex-
   perts Meeting On Arabic Ontologies And Semantic Networks. Alecso, Arab League.
   Tunis, July 26-28, 2011.
13. M. Jarrar, Lexical Semantics and Multilingualism. Lecture Notes, Sina Institute,
   Birzeit University, 2013.
14. A. Liang, M. Sini, Mapping AGROVOC & the Chinese Agricultural Thesaurus:
   Definitions, Tools Procedures. New Review of Hypermedia & Multimedia, 2006.
15. F. Narducci, M. Palmonari, and G. Semeraro. Cross-language semantic matching
   for discovering links to e-gov services in the LOD cloud . Know@LOD,ESWC, 2013.
16. G. Ngai, M. Carpuat, P. Fung, Identifying Concepts Across Languages: A First
   Step towards A Corpus-based Approach to Automatic Ontology Alignment. In:
   Proceedings of the 19th COLING, 2002.
17. R. Navigli. Word Sense Disambiguation: a Survey. ACM Computing Surveys, 2009.
18. R. Navigli and S. Ponzetto. BabelNet: The Automatic Construction, Evaluation
   and Application of a Wide-Coverage Multilingual Semantic Network. AI. 2012.
19. E. Pianta, L. Bentivogli, C. Girardi, MultiWordNet: Developing an aligned multi-
   lingual database, in: Proceedings of the 1st International GWC. 2002.
20. E. Pianta, L. Bentivogli, Knowledge Intensive Word Alignment with KNOWA. In
   Proceedings of COLING 2004.
21. M. T. Pilehvar and R. Navigli. A Robust Approach to Aligning Heterogeneous
   Lexical Resources. Proc. of the ACL 2014
22. H. Rodríguez, D. Farwell, J. Farreres, M. Bertran, M. Alkhalifa, M. Antonia Martí,
   W. Black, S. Elkateb, J. Kirk, A. Pease, P. Vossen, Ch. Fellbaum. Arabic WordNet:
   Current State and Future Extensions in: Proc. of the GWC 2008.
23. P. Shvaiko and J. Euzenat. Ontology matching: State of the art and future chal-
   lenges. IEEE Trans. Knowl. Data Eng 2013.
24. C. Sarasua, E. Simperl, N.F. Noy: CrowdMAP: Crowdsourcing Ontology Align-
   ment with Microtasks. In: Cudré-Mauroux, P., et al. (eds.) ISWC 2012, Part I.
   LNCS, Springer, Heidelberg (2012)
25. D. Spohr, L. Hollink, and Ph. Cimiano. A machine learning approach to multilin-
   gual and cross-lingual ontology matching. ISWC 2011.
26. Venetis, Petros and Garcia-Molina, Hector and Huang, Kerui and Polyzotis, Neok-
   lis. Max Algorithms in Crowdsourcing Environments. Proc. WWW 2012.
27. P. Vossen. EuroWordNet: a multilingual database of autonomous and language-
   specific wordnets connected via an Inter-Lingual-Index. IJL 2004.