
Using semantic relatedness and word sense disambiguation for (CL)IR

Eneko Agirre, Arantxa Otegi (arantza.otegig@ehu.es)
IXA NLP Group, University of the Basque Country, Donostia, Basque Country

Hugo Zaragoza (hugoz@yahoo-inc.com)
Yahoo! Research, Barcelona, Spain

Keywords: Robust Retrieval, CLIR, Word Sense Disambiguation, Lexical Relatedness, Document Expansion

In this paper we report our experiments for the CLEF 2009 Robust-WSD task, covering both the monolingual (English) and the bilingual (Spanish to English) subtasks. Our main experimentation strategy consisted of expanding and translating the documents, based on the concepts related to each document. For that purpose we applied a state-of-the-art semantic relatedness method based on WordNet. The relatedness measure was used with and without WSD information. Although we obtained positive results on our training and development datasets, we did not manage to improve over the baseline in the monolingual case, and the improvement over the baseline in the bilingual case is marginal. We plan to work further on this technique, which has attained positive results in the passage retrieval for question answering task at CLEF (ResPubliQA).


1 Introduction

Our goal is to test whether Word Sense Disambiguation (WSD) information can be beneficial for Cross Lingual Information Retrieval (CLIR) or monolingual Information Retrieval (IR). WordNet has previously been used to expand the terms in the query with some success [3, 4, 5, 7]. WordNet-based approaches need to deal with ambiguity, which proves difficult given the little context available to disambiguate the words in the query effectively. In our experience, document expansion works better than topic expansion (see our results from the last edition in [6]). Bearing this in mind, in this edition we have focused mainly on documents, using a more elaborate expansion strategy. We have applied a state-of-the-art semantic relatedness method based on WordNet [1] in order to select the best terms to expand the documents. The relatedness method can optionally use the WSD information provided by the organizers.

The remainder of this paper is organized as follows. Section 2 describes the experiments carried out. Section 3 presents the results obtained. Finally, Section 4 draws the conclusions and mentions future work.

2 Experiments

Our main experimentation strategy consisted of expanding the documents, based on the concepts related to each document. The steps of our retrieval system are the following. We first expand and translate the topics. In a second step we extract the related concepts of the documents, and expand the documents with the words linked to these concepts in WordNet. Then we index the new expanded documents and, finally, we run the queries against the indexes in various combinations. The steps are described in order below.

2.1 Expansion and translation strategies of the topics

The WSD data provided to the participants was based on WordNet version 1.6. Each word sense has a WordNet synset assigned with a score. Using those synset codes and the English and Spanish wordnets, we expanded the topics. In this way, we generated different topic collections using different approaches of expansion and translation, as follows:

Full expansion of English topics: expansion to all synonyms of all senses.

Best expansion of English topics: expansion to the synonyms of the sense with highest WSD score for each word, using either UBC or NUS disambiguation data (as provided by organizers).

Translation of Spanish topics: translation from Spanish to English of the first sense for each word, taking the English variants from WordNet.
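The three strategies above can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: the toy dictionaries `EN_VARIANTS` and `ES_SENSES`, the synset identifiers and the WSD scores are all hypothetical stand-ins for the wordnets and the UBC/NUS annotations provided by the organizers.

```python
from typing import Dict, List, Tuple

# Hypothetical toy English wordnet: synset id -> variants (synonyms).
EN_VARIANTS: Dict[str, List[str]] = {
    "bank-n-1": ["bank", "depository"],
    "bank-n-2": ["bank", "riverbank"],
}
# Hypothetical toy Spanish wordnet: word -> synset ids, in sense order.
ES_SENSES: Dict[str, List[str]] = {"banco": ["bank-n-1", "bank-n-2"]}

# WSD annotation of one topic word: list of (synset id, score).
Senses = List[Tuple[str, float]]


def full_expansion(senses: Senses) -> List[str]:
    """Synonyms of all senses, without repeats, in first-seen order."""
    out: List[str] = []
    for synset, _score in senses:
        for variant in EN_VARIANTS.get(synset, []):
            if variant not in out:
                out.append(variant)
    return out


def best_expansion(senses: Senses) -> List[str]:
    """Synonyms of the sense with the highest WSD score."""
    if not senses:
        return []
    best_synset, _score = max(senses, key=lambda s: s[1])
    return list(EN_VARIANTS.get(best_synset, []))


def translate_first_sense(word: str) -> List[str]:
    """English variants of the first sense of a Spanish word."""
    synsets = ES_SENSES.get(word, [])
    return list(EN_VARIANTS.get(synsets[0], [])) if synsets else []
```

With the toy data, a word annotated with two senses yields three terms under full expansion and only the two variants of the top-scoring sense under best expansion.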

For both expansion and translation we used the Spanish and English wordnet versions provided by the organizers.

2.2 Query construction

We constructed queries using the title and description topic fields. Based on the training topics, we excluded some words and phrases from the queries, such as find, describing, discussing, document, report for English and encontrar, describir, documentos, noticias, ejemplos for Spanish.

After excluding those words and taking only nouns, adjectives, verbs and numbers, we constructed several queries for each topic using the different expansions of the topics (see Section 2.1) as follows:

Original words.

Both original words and expansions for the best sense of each word.

Both original words and all expansions for each word.

Translated words, using translations for the best sense of each word. If a word had no translation, the original word was included in the query.
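A rough sketch of how such query variants could be assembled is given below. It is only an illustration under several assumptions: the part-of-speech tag names, the function names and the example expansions are hypothetical, and only single excluded words (not phrases) are handled.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# English words excluded from the queries, as listed in the text.
EXCLUDED_EN: Set[str] = {"find", "describing", "discussing", "document", "report"}
# Hypothetical coarse tags for nouns, adjectives, verbs and numbers.
KEPT_POS: Set[str] = {"NOUN", "ADJ", "VERB", "NUM"}

Tagged = List[Tuple[str, str]]  # (word, part-of-speech tag)


def content_words(tagged: Tagged, excluded: Set[str] = EXCLUDED_EN) -> List[str]:
    """Lowercased words with a kept tag that are not in the excluded set."""
    return [w.lower() for w, pos in tagged
            if pos in KEPT_POS and w.lower() not in excluded]


def _unique(terms: Iterable[str]) -> List[str]:
    """Remove repeated terms while preserving order."""
    seen: List[str] = []
    for term in terms:
        if term not in seen:
            seen.append(term)
    return seen


def expanded_query(tagged: Tagged,
                   expansions: Optional[Dict[str, List[str]]] = None,
                   excluded: Set[str] = EXCLUDED_EN) -> List[str]:
    """Original words, each followed by its expansions (if any are given)."""
    expansions = expansions or {}
    terms: List[str] = []
    for word in content_words(tagged, excluded):
        terms.append(word)
        terms.extend(expansions.get(word, []))
    return _unique(terms)


def translated_query(tagged: Tagged,
                     translations: Dict[str, List[str]],
                     excluded: Set[str]) -> List[str]:
    """Translations of each word; untranslated words are kept as they are."""
    terms: List[str] = []
    for word in content_words(tagged, excluded):
        terms.extend(translations.get(word) or [word])
    return _unique(terms)
```

Calling `expanded_query` without expansions gives the original-words query; passing best-sense or all-sense expansions gives the second and third variants, and `translated_query` corresponds to the fourth.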

The first three cases are for the monolingual runs, and the last one is for the bilingual run in which the query was translated.

2.3 Expansion and translation strategies of the documents

Our document expansion strategy was based on semantic relatedness. For that purpose we used UKB (publicly available at http://ixa2.si.ehu.es/ukb/), a collection of programs for performing graph-based Word Sense Disambiguation and lexical similarity/relatedness using a pre-existing knowledge base, in this case WordNet 1.6.

Given a document, UKB returns a vector of scores, one for each concept in WordNet. The higher the score, the more related the concept is to the given document. In our experiments we used two different approaches to represent each document:

Using all the synsets of each word of the document.

Using only the synset with the highest WSD score for each word, as given by the UBC disambiguation data (provided by the organizers).

In both cases, UKB was initialized using the WSD weights: each synset was weighted with the score returned by the disambiguation system, that is, each concept was weighted according to the WSD weight of the corresponding sense of the target word.
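To make the initialization concrete, here is a small sketch of a graph-based relatedness computation. It assumes a personalized PageRank over a concept graph whose teleport vector is set from the WSD weights; the text above only states that UKB is graph-based, so the algorithm choice, the damping factor, the number of iterations and the toy graph are all assumptions of this sketch and not a description of UKB's actual implementation.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    init_weights: Dict[str, float],
    damping: float = 0.85,   # assumed value, not taken from the paper
    iterations: int = 30,    # assumed value, not taken from the paper
) -> Dict[str, float]:
    """Score every concept by its relatedness to the weighted initial concepts.

    `graph` maps each concept to its neighbours (every neighbour must also be
    a key). `init_weights` maps the document's synsets to their WSD weights.
    """
    total = sum(init_weights.get(node, 0.0) for node in graph)
    if total <= 0.0:
        raise ValueError("at least one initial concept needs a positive weight")
    # Teleport vector: probability mass only on the document's synsets.
    teleport = {node: init_weights.get(node, 0.0) / total for node in graph}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) * teleport[node] for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                continue  # isolated concepts pass no score on
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank
```

In this sketch, using all synsets of a word would put every sense in `init_weights`, while the best-sense setting would keep only the top-scoring synset of each word.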

Once UKB outputs the list of related concepts, we took the highest-scoring 100 or 500 concepts and expanded them to all variants (words in the concept) as given by WordNet. For the bilingual run, we took the Spanish variants. In both cases we used the Spanish and English wordnet versions provided by the organizers.
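The selection and expansion step can be sketched as below. The function names, the concept identifiers and the variant dictionary are hypothetical; in the experiments the variants come from the wordnets provided by the organizers.

```python
from typing import Dict, List


def top_concepts(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring concept ids, best first (ties by id)."""
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


def expand_to_variants(concepts: List[str],
                       variants: Dict[str, List[str]]) -> List[str]:
    """Collect all variants (words) of the given concepts, without repeats."""
    terms: List[str] = []
    for concept in concepts:
        for word in variants.get(concept, []):
            if word not in terms:
                terms.append(word)
    return terms
```

For example, `expand_to_variants(top_concepts(scores, 100), english_variants)` would give the expansion terms for the 100 most related concepts, and passing a Spanish variant dictionary instead would give the terms used for the bilingual run.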

The variants for those expanded concepts were included in two new fields of the document representation: the top 100 concepts in the first field and the following 400 concepts in the second field. This way, we were able to use the original words only, the original words and the 100 most related concepts, or the original words and the 500 most related concepts. We will get back to this in Section 2.4 and Section 2.5.

2.4 Indexing

We indexed the new expanded documents using the MG4J search engine [2]. MG4J makes it possible to combine several indices over the same document collection. We created one index for each field: one for the original words, one for the expansion of the top 100 concepts, and another one for the expansion of the following 400 concepts. Porter stemming was used, as usual.

2.5 Retrieval

We carried out several retrieval experiments combining different kinds of queries with different kinds of indices. We used the training data to perform extensive experimentation, and chose the settings with the best MAP results in order to produce the test topic runs.

The different kinds of queries that we had prepared are those explained in Section 2.2. Our experiments showed that the original words alone obtained good results, so in the test runs we used only the queries with original words.

MG4J allows multi-index queries, where one can specify which of the indices to search in and assign a different weight to each index. We conducted different experiments, using the original words alone (the index made of original words) and also using one or both indices with the expansion of concepts, giving different weights to the original words and the expanded concepts. The best weights were then used on the test set, as explained in Section 3.
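The effect of index weights can be pictured with the following sketch, which assumes that the per-index scores of a document are combined linearly. It does not use MG4J's API or query syntax, and the index names and scores are made-up values for illustration.

```python
from typing import Dict, List


def combined_score(index_scores: Dict[str, float],
                   index_weights: Dict[str, float]) -> float:
    """Weighted sum of a document's scores over the selected indices."""
    return sum(weight * index_scores.get(name, 0.0)
               for name, weight in index_weights.items())


def rank_documents(docs: Dict[str, Dict[str, float]],
                   index_weights: Dict[str, float]) -> List[str]:
    """Document ids sorted by decreasing combined score (ties by id)."""
    return sorted(docs,
                  key=lambda d: (-combined_score(docs[d], index_weights), d))
```

Leaving an index out of `index_weights` corresponds to not searching it, so `{"original": 1.0}` reproduces the original-words setting, while `{"original": 1.0, "exp100": 0.25}` adds the expansion index with a lower weight.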

We used the BM25 ranking function with the following parameters: 1.0 for k1 and 0.6 for b. We did not tune these parameters.
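For reference, the role of the two parameters can be seen in the following sketch of a per-term BM25 score. It uses one common formulation of BM25; the exact variant implemented in MG4J (in particular the IDF component) may differ, so the formula below is an assumption, with only the default values of k1 and b taken from the text.

```python
import math


def bm25_term_score(tf: float, doc_len: float, avg_doc_len: float,
                    n_docs: int, doc_freq: int,
                    k1: float = 1.0, b: float = 0.6) -> float:
    """BM25 contribution of one query term to one document.

    tf: term frequency in the document; doc_freq: number of documents
    containing the term; n_docs: collection size.
    """
    # Smoothed IDF that stays non-negative (an assumed, commonly used form).
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: b = 0 ignores length, b = 1 normalizes fully.
    norm = 1.0 - b + b * doc_len / avg_doc_len
    # k1 controls how quickly repeated occurrences saturate.
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)
```

The score of a document for a query is the sum of this quantity over the query terms.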

The submitted runs are described in Section 3.

3 Results

monolingual with WSD:

- EnEnAllSenses100Docs: original terms in topics; both original and expanded terms of 100 concepts, using all senses for initializing the semantic graph. The weight of the index that included the expanded terms: 0.25.

- EnEnBestSense100Docs: original terms in topics; both original and expanded terms of 100 concepts, using best sense for initializing the semantic graph. The weight of the index that included the expanded terms: 0.25.

- EnEnBestSense500Docs: original terms in topics; both original and expanded terms of 500 concepts, using best sense for initializing the semantic graph. The weight of the index that included the expanded terms: 0.25.

bilingual without WSD:

- EsEnNowsd: translated terms in topics (from Spanish to English); original terms in documents (in English).

bilingual with WSD:

- EsEn1stTopsAllSenses100Docs: translated terms in topics (from Spanish to English); both original and expanded terms of 100 concepts, using all senses for initializing the semantic graph. The weight of the index that included the expanded terms: 0.15.

- EsEn1stTopsBestSense500Docs: translated terms in topics (from Spanish to English); both original and expanded terms of 100 concepts, using best sense for initializing the semantic graph. The weight of the index that included the expanded terms: 0.15.

- EsEnAllSenses100Docs: original terms in topics (in Spanish); both original terms (in English) and translated terms (in Spanish) in documents, using all senses for initializing the semantic graph. The weight of the index that included the expanded terms: 1.00.

- EsEnBestSense500Docs: original terms in topics (in Spanish); both original terms (in English) and translated terms (in Spanish) in documents, using best sense for initializing the semantic graph. The weight of the index that included the expanded terms: 1.60.

The weight of the index which was created using the original terms of the documents was 1.00 for all the runs.

Regarding the monolingual results, using the best sense to represent the document when initializing the semantic graph achieves slightly higher results than using all senses. In addition, we obtained better results when we expanded the documents using 500 concepts rather than only 100 (compare the runs EnEnBestSense100Docs and EnEnBestSense500Docs). However, we did not achieve any improvement over the baseline with either WSD or semantic relatedness information. We did achieve an improvement on the training data, but the difference was not significant (we used paired randomization tests over MAP values with α = 0.05).
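A paired randomization test of the kind mentioned above can be sketched as follows. This is a generic Monte Carlo version written for illustration, not the script used in the experiments; the number of trials, the fixed seed and the input scores are assumptions.

```python
import random
from typing import Sequence


def paired_randomization_test(scores_a: Sequence[float],
                              scores_b: Sequence[float],
                              trials: int = 10000,
                              seed: int = 0) -> float:
    """Two-sided p-value for the difference in mean of paired per-topic scores.

    Under the null hypothesis the two systems are interchangeable, so the
    sign of each per-topic difference can be flipped at random.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    hits = 0
    for _ in range(trials):
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        # Small tolerance guards against floating-point noise in ties.
        if abs(total) / n >= observed - 1e-12:
            hits += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

Here each list would hold the average precision of one run on every topic, and the difference in MAP would be called significant when the returned p-value is below 0.05.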

With respect to the bilingual results, EsEn1stTopsBestSense500Docs obtains the best result, although the difference with respect to the baseline run is not statistically significant. This differs from the results obtained on the training data, where the improvements from semantic expansion were remarkable. It is not clear whether translating the topics from Spanish to English or translating the documents from English to Spanish is better, since the first option gave better results in the testing phase (see the runs named ...1stTops... in Table 1), but not in the training phase.

In our experiments we did not make any effort to deal with hard topics, and we only paid attention to improvements in the Mean Average Precision (MAP) metric. In fact, we applied the settings which proved best on the training data according to MAP. Another option would have been to optimize the parameters and settings according to Geometric Mean Average Precision (GMAP) values.

4 Conclusions and future work

We have described our experiments and the results obtained in both the monolingual and bilingual tasks of the Robust-WSD track at CLEF 2009. Our main experimentation strategy consisted of expanding the documents based on a semantic relatedness algorithm.

The objective of carrying out different expansion strategies was to study whether WSD information and semantic relatedness could be used effectively in (CL)IR. After analyzing the results, we have found that those expansion strategies were not very helpful, especially in the monolingual task.

For the future, we want to analyze why we have not achieved higher gains using the semantic expansion, as the same strategy obtained remarkable improvements in the passage retrieval task (ResPubliQA).

Acknowledgments

This work has been supported by KNOW (TIN2006-15049-C03-01) and KYOTO (ICT-2007211423). Arantxa Otegi's work is funded by a PhD grant from the Basque Government.

References

[1] E. Agirre, A. Soroa, E. Alfonseca, K. Hall, J. Kravalova, and M. Pasca. A study on similarity and relatedness using distributional and WordNet-based approaches. In Proceedings of annual meeting of the North American Chapter of the Association of Computational Linguistics (NAACL), Boulder, USA, June 2009.

[2] P. Boldi and S. Vigna. MG4J at TREC 2005. In Ellen M. Voorhees and Lori P. Buckland, editors, The Fourteenth Text REtrieval Conference (TREC 2005) Proceedings, number SP 500-266 in Special Publications. NIST, 2005. http://mg4j.dsi.unimi.it/.

[3] S. Kim, H. Seo, and H. Rim. Information retrieval using word senses: Root sense tagging approach. In Proceedings of SIGIR, 2004.

[4] S. Liu, F. Liu, C. Yu, and W. Meng. An effective approach to document retrieval via utilizing wordnet and recognizing phrases. In Proceedings of SIGIR, 2004.

[5] S. Liu, C. Yu, and W. Meng. Word sense disambiguation in queries. In Proceedings of ACM Conference on Information and Knowledge Management (CIKM), 2005.

[7] J.R. Perez-Aguera and H. Zaragoza. UCM-Y!R at CLEF2008 Robust and WSD tasks. In Working Notes of the Cross-Lingual Evaluation Forum, Aarhus, Denmark, 2008.